ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR
PREPARING ARTIFICIAL CHROMOSOMES 

RELATED APPLICATIONS 

For U.S. national stage purposes, this application is a
continuation-in-part of copending U.S. application Serial No.
08/695,191, filed August 7, 1996 by GYULA HADLACZKY and ALADAR SZALAY,
entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR
PREPARING ARTIFICIAL CHROMOSOMES. This application is also a
continuation-in-part of copending U.S. application Serial No.
08/682,080, filed July 15, 1996 by GYULA HADLACZKY and ALADAR SZALAY,
entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR
PREPARING ARTIFICIAL CHROMOSOMES, and is also a continuation-in-part
of copending U.S. application Serial No. 08/629,822, filed April 10,
1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL
CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL
CHROMOSOMES.

For international purposes, the benefit of priority to each of these
applications is claimed and the subject matter of each application is
incorporated herein in its entirety.

U.S. application Serial No. 08/695,191 is a continuation-in-part of
U.S. application Serial No. 08/682,080 and also is a continuation-in-part
of U.S. application Serial No. 08/629,822. U.S. application Serial No.
08/682,080 is a continuation-in-part of U.S. application Serial No.
08/629,822.

This application is related to U.S. application Serial No.
07/759,558, now U.S. Patent No. 5,288,625, is related to U.S.
application Serial No. 08/734,344, filed October 21, 1996, and is related
to allowed U.S. application Serial No. 08/375,271, filed 1/19/95, which
is a continuation of U.S. application Serial No. 08/080,097, filed 6/23/93,
which is a continuation of U.S. application Serial No. 07/892,487, filed
6/3/92, which is a continuation of U.S. application Serial No.
07/521,073, filed 5/9/90.

To the extent permitted, the subject matter of each of U.S.
application Serial Nos. 08/734,344, 08/695,191, 08/682,080,
08/629,822, 08/375,271, 08/080,097, 07/892,487, and 07/521,073,
and U.S. Patent No. 5,288,625 is incorporated in its entirety by
reference thereto.

FIELD OF THE INVENTION 

The present invention relates to methods for preparing cell lines
that contain artificial chromosomes, methods for isolation of the artificial
chromosomes, targeted insertion of heterologous DNA into the
chromosomes, delivery of the chromosomes to selected cells and tissues
and methods for isolation and large-scale production of the
chromosomes. Also provided are cell lines for use in the methods, and
cell lines and chromosomes produced by the methods. Further provided
are cell-based methods for production of heterologous proteins, gene
therapy methods and methods of generating transgenic animals,
particularly non-human transgenic animals, that use artificial
chromosomes.

BACKGROUND OF THE INVENTION 

Several viral vectors, non-viral, and physical delivery systems for
gene therapy and recombinant expression of heterologous nucleic acids
have been developed [see, e.g., Mitani et al. (1993) Trends Biotech.
11:162-166]. The presently available systems, however, have numerous
limitations, particularly where persistent, stable, or controlled gene
expression is required. These limitations include: (1) size limitations,
because there is a limit, generally on the order of about ten kilobases
[kb], at most, to the size of the DNA insert [gene] that can be accepted
by viral vectors, whereas a number of mammalian genes of possible
therapeutic importance are well above this limit, especially if all
control elements are included; (2) the inability to specifically target
integration, so that random integration occurs, which carries a risk of
disrupting vital genes or cancer suppressor genes; (3) the expression of
randomly integrated therapeutic genes may be affected by the functional
compartmentalization in the nucleus and is affected by chromatin-based
position effects; and (4) the copy number and consequently the expression
of a given gene to be integrated into the genome cannot be controlled.
Thus, improvements in gene delivery and stable expression systems are
needed [see, e.g., Mulligan (1993) Science 260:926-932].

In addition, safe and effective vectors and gene therapy methods
should have numerous features that are not assured by the presently
available systems. For example, a safe vector should not contain DNA
elements that can promote unwanted changes by recombination or
mutation in the host genetic material, should not have the potential to
initiate deleterious effects in cells, tissues, or organisms carrying the
vector, and should not interfere with genomic functions. In addition, it
would be advantageous for the vector to be non-integrative, or designed
for site-specific integration. Also, the copy number of therapeutic
gene(s) carried by the vector should be controlled and stable; the vector
should secure the independent and controlled function of the introduced
gene(s); and the vector should accept large (up to Mb size) inserts and
ensure the functional stability of the insert.

The limitations of existing gene delivery technologies, however,
argue for the development of alternative vector systems suitable for
transferring large [up to Mb size or larger] genes and gene complexes
together with regulatory elements that will provide a safe, controlled, and
persistent expression of the therapeutic genetic material.

At the present time, none of the available vectors fulfill all these
requirements. Most of these characteristics, however, are possessed by
chromosomes. Thus, an artificial chromosome would be an ideal vector
for gene therapy, as well as for stable, high-level, controlled production
of gene products that require coordination of expression of numerous
genes or that are encoded by large genes, and other uses. Artificial
chromosomes for expression of heterologous genes in yeast are
available, but construction of defined mammalian artificial chromosomes
has not been achieved. Such construction has been hindered by the lack
of an isolated, functional, mammalian centromere and uncertainty
regarding the requisites for its production and stable replication. Unlike
in yeast, there are no selectable genes in close proximity to a mammalian
centromere, and the presence of long runs of highly repetitive pericentric
heterochromatic DNA makes the isolation of a mammalian centromere
using presently available methods, such as chromosome walking,
virtually impossible. Other strategies are required for production of
mammalian artificial chromosomes, and some have been developed. For
example, U.S. Patent No. 5,288,625 provides a cell line that contains an
artificial chromosome, a minichromosome, that is about 20 to 30
megabases. Methods provided for isolation of these chromosomes,
however, provide preparations of only about 10-20% purity. Thus,
development of alternative artificial chromosomes and perfection of
isolation and purification methods, as well as development of more
versatile chromosomes and further characterization of the
minichromosomes, are required to realize the potential of this technology.

Therefore, it is an object herein to provide mammalian artificial
chromosomes and methods for introduction of foreign DNA into such
chromosomes. It is also an object herein to provide methods of isolation
and purification of the chromosomes. It is also an object herein to
provide methods for introduction of the mammalian artificial chromosome
into selected cells, and to provide the resulting cells, as well as
transgenic non-human animals, birds, fish and plants that contain the
artificial chromosomes. It is also an object herein to provide methods for
gene therapy and expression of gene products using artificial
chromosomes. It is a further object herein to provide methods for
constructing species-specific artificial chromosomes de novo. Another
object herein is to provide methods to generate de novo mammalian
artificial chromosomes.

SUMMARY OF THE INVENTION

Mammalian artificial chromosomes [MACs] are provided. Also
provided are artificial chromosomes for other higher eukaryotic species,
such as insects, birds, fowl and fish, produced using the MACs and
methods provided herein. Methods for generating and isolating such
chromosomes are provided. Methods using the MACs to construct
artificial chromosomes from other species, such as insect, bird, fowl and
fish species, are also provided. The artificial chromosomes are fully
functional stable chromosomes. Two types of artificial chromosomes are
provided. One type, herein referred to as SATACs [satellite artificial
chromosomes or satellite DNA based artificial chromosomes (the terms
are used interchangeably herein)], are stable heterochromatic
chromosomes, and the other type are minichromosomes based on
amplification of euchromatin.

Artificial chromosomes provide an extra-genomic locus for targeted
integration of megabase [Mb] pair size DNA fragments that contain single
or multiple genes, including multiple copies of a single gene operatively
linked to one promoter or each copy or several copies linked to separate
promoters. Thus, methods using the MACs to introduce the genes into
cells, tissues, and animals, as well as species such as birds, fowl, fish
and plants, are also provided. The artificial chromosomes with integrated
heterologous DNA may be used in methods of gene therapy, in methods
of production of gene products, particularly products that require
expression of multigenic biosynthetic pathways, and also are intended for
delivery into the nuclei of germline cells, such as embryo-derived stem
cells [ES cells], for production of transgenic (non-human) animals, birds,
fowl and fish. Transgenic plants, including monocots and dicots, are
also contemplated herein.

Mammalian artificial chromosomes provide extra-genomic specific
integration sites for introduction of genes encoding proteins of interest
and permit megabase size DNA integration so that, for example, genes
encoding an entire metabolic pathway, or a very large gene, such as the
cystic fibrosis [CF; ~250 kb] genomic DNA gene, or several genes, such as
multiple genes encoding a series of antigens for preparation of a
multivalent vaccine, can be stably introduced into a cell. Vectors for
targeted introduction of such genes, including the tumor suppressor
genes, such as p53, the cystic fibrosis transmembrane regulator cDNA
[CFTR], and the genes for anti-HIV ribozymes, such as an anti-HIV gag
ribozyme gene, into the artificial chromosomes are also provided.
The chromosomes provided herein are generated by introducing
heterologous DNA that includes DNA encoding one or multiple selectable
marker(s) into cells, preferably a stable cell line, growing the cells under
selective conditions, and identifying from among the resulting clones
those that include chromosomes with more than one centromere and/or
fragments thereof. The amplification that produces the additional
centromere or centromeres occurs in cells that contain chromosomes in
which the heterologous DNA has integrated near the centromere in the
pericentric region of the chromosome. The selected clonal cells are then
used to generate artificial chromosomes.

Although non-targeted introduction of DNA results in some
frequency of integration into appropriate loci, targeted introduction is
preferred. Hence, in preferred embodiments, the DNA with the
selectable marker that is introduced into cells to initiate generation of
artificial chromosomes includes sequences that target it to an
amplifiable region, such as the pericentric region, heterochromatin, and
particularly rDNA of the chromosome. For example, vectors, such as
pTEMPUD and pHASPUD (provided herein), which include such DNA
specific for mouse satellite DNA and human satellite DNA, respectively,
are provided. The plasmid pHASPUD is a derivative of pTEMPUD that
contains human satellite DNA sequences that specifically target human
chromosomes. Preferred targeting sequences include mammalian
ribosomal RNA (rRNA) gene sequences (referred to herein as rDNA),
which target the heterologous DNA to integrate into the rDNA region of
those chromosomes that contain rDNA. For example, vectors, such as
pTERPUD, which include mouse rDNA, are provided. Upon integration
into existing chromosomes in the cells, these vectors can induce the
amplification that results in generation of additional centromeres.

Artificial chromosomes are generated by culturing the cells with
the multicentric, typically dicentric, chromosomes under conditions
whereby the chromosome breaks to form a minichromosome and a
formerly dicentric chromosome. Among the MACs provided herein are
the SATACs, which are primarily made up of repeating units of short
satellite DNA and are nearly fully heterochromatic, so that without
insertion of heterologous or foreign DNA, the chromosomes preferably
contain no genetic information or contain only non-protein-encoding gene
sequences such as rDNA sequences. They can thus be used as "safe"
vectors for delivery of DNA to mammalian hosts because they do not
contain any potentially harmful genes. The SATACs are generated not
from the minichromosome fragment as, for example, in U.S. Patent No.
5,288,625, but from the fragment of the formerly dicentric chromosome.

In addition, methods for generating euchromatic minichromosomes
and the use thereof are also provided herein. Methods for generating
one type of MAC, the minichromosome, previously described in U.S.
Patent No. 5,288,625, and the use thereof for expression of
heterologous DNA are provided. In a particular method provided herein
for generating a MAC, such as a minichromosome, heterologous DNA
that includes mammalian rDNA and one or more selectable marker genes
is introduced into cells, which are then grown under selective conditions.
Resulting cells that contain chromosomes with more than one centromere
are selected and cultured under conditions whereby the chromosome
breaks to form a minichromosome and a formerly multicentric (typically
dicentric) chromosome from which the minichromosome was released.
Cell lines containing the minichromosome and the use thereof for
cell fusion are also provided. In one embodiment, a cell line containing
the mammalian minichromosome is used as recipient cells for donor DNA
encoding a selected gene or multiple genes. To facilitate integration of
the donor DNA into the minichromosome, the recipient cell line preferably
contains the minichromosome but does not also contain the formerly
dicentric chromosome. This may be accomplished by methods disclosed
herein, such as cell fusion and selection of cells that contain a
minichromosome and no formerly dicentric chromosome. The donor DNA
is linked to a second selectable marker and is targeted to and integrated
into the minichromosome. The resulting chromosome is transferred by
cell fusion into an appropriate recipient cell line, such as a Chinese
hamster cell line [CHO]. After large-scale production of the cells carrying
the engineered chromosome, the chromosome is isolated. In particular,
metaphase chromosomes are obtained, such as by addition of colchicine,
and they are purified from the cell lysate. These chromosomes are used
for cloning, sequencing and for delivery of heterologous DNA into cells.

Also provided are SATACs of various sizes that are formed by
repeated culturing under selective conditions and subcloning of cells that
contain chromosomes produced from the formerly dicentric
chromosomes. The exemplified SATACs are based on repeating DNA
units that are about 15 Mb [two ~7.5 Mb blocks]. The repeating DNA
unit of SATACs formed from other species and other chromosomes may
vary, but typically would be on the order of about 7 to about 20 Mb.
The repeating DNA units are referred to herein as megareplicons, which
in the exemplified SATACs contain tandem blocks of satellite DNA
flanked by non-satellite DNA, including heterologous DNA and
non-satellite DNA. Amplification produces an array of chromosome segments
[each called an amplicon] that contain two inverted megareplicons
bordered by heterologous ["foreign"] DNA. Repeated cell fusion, growth
on selective medium and/or BrdU [5-bromodeoxyuridine] treatment or
other treatment with other genome destabilizing reagent or agent, such
as ionizing radiation, including X-rays, and subcloning results in cell lines
that carry stable heterochromatic or partially heterochromatic
chromosomes, including a 150-200 Mb "sausage" chromosome, a
500-1000 Mb gigachromosome, a stable 250-400 Mb megachromosome and
various smaller stable chromosomes derived therefrom. These
chromosomes are based on these repeating units and can include
heterologous DNA that is expressed.
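
The size relationships stated above can be checked with simple arithmetic. The following is a minimal sketch, not part of the disclosed methods. It assumes a megareplicon of about 15 Mb and an amplicon of two inverted megareplicons (about 30 Mb), and it makes the simplifying assumption that a chromosome consists entirely of amplicons, ignoring the centromere, telomeres and any euchromatic terminal segment. The names `MEGAREPLICON_MB`, `AMPLICON_MB`, `SIZE_RANGES_MB` and `amplicon_count_range` are hypothetical and introduced only for illustration.

```python
# Approximate unit sizes taken from the text (in megabases).
MEGAREPLICON_MB = 15.0             # about 15 Mb, i.e. two ~7.5 Mb blocks
AMPLICON_MB = 2 * MEGAREPLICON_MB  # two inverted megareplicons per amplicon

# Size ranges of the chromosomes named in the text (in megabases).
SIZE_RANGES_MB = {
    "sausage chromosome": (150, 200),
    "megachromosome": (250, 400),
    "gigachromosome": (500, 1000),
}


def amplicon_count_range(low_mb, high_mb, amplicon_mb=AMPLICON_MB):
    """Rough number of amplicons spanned by a chromosome size range.

    Simplifying assumption: the whole chromosome is made of amplicons.
    """
    if not 0 < low_mb <= high_mb:
        raise ValueError("expected 0 < low_mb <= high_mb")
    return (round(low_mb / amplicon_mb, 1), round(high_mb / amplicon_mb, 1))


if __name__ == "__main__":
    for name, (low, high) in SIZE_RANGES_MB.items():
        lo_n, hi_n = amplicon_count_range(low, high)
        print(f"{name}: roughly {lo_n} to {hi_n} amplicons of {AMPLICON_MB} Mb")
```

Under these assumptions a 250-400 Mb megachromosome would correspond to roughly 8 to 13 amplicons; the figure is only an order-of-magnitude guide.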

Thus, methods for producing MACs of both types (i.e., SATACs
and minichromosomes) are provided. These methods are applicable to
the production of artificial chromosomes containing centromeres derived
from any higher eukaryotic cell, including mammals, birds, fowl, fish,
insects and plants.

The resulting chromosomes can be purified by methods provided
herein to provide vectors for introduction of heterologous DNA into
selected cells for production of the gene product(s) encoded by the
heterologous DNA, for production of transgenic (non-human) animals,
birds, fowl, fish and plants or for gene therapy.

In addition, methods and vectors for fragmenting the
minichromosomes and SATACs are provided. Such methods and vectors
can be used for in vivo generation of smaller stable artificial
chromosomes. Vectors for chromosome fragmentation are used to
produce an artificial chromosome that contains a megareplicon, a
centromere and two telomeres and will be between about 7.5 Mb and
about 60 Mb, preferably between about 10-15 Mb and 30-50 Mb.
As exemplified herein, the preferred range is between about 7.5 Mb and
50 Mb. Such artificial chromosomes may also be produced by other
methods.

Isolation of the 15 Mb [or 30 Mb amplicon containing two 15 Mb
inverted repeats] or a 30 Mb or higher multimer, such as 60 Mb, thereof
should provide a stable chromosomal vector that can be manipulated in
vitro. Methods for reducing the size of the MACs to generate smaller
stable self-replicating artificial chromosomes are also provided.

Also provided herein are methods for producing mammalian
artificial chromosomes, including those provided herein, in vitro, and the
resulting chromosomes. The methods involve in vitro assembly of the
structural and functional elements to provide a stable artificial
chromosome. Such elements include a centromere, two telomeres, at
least one origin of replication and filler heterochromatin, e.g., satellite
DNA. A selectable marker for subsequent selection is also generally
included. These specific DNA elements may be obtained from the
artificial chromosomes provided herein, such as those that have been
generated by the introduction of heterologous DNA into cells and the
subsequent amplification that leads to the artificial chromosome,
particularly the SATACs. Centromere sequences for use in the in vitro
construction of artificial chromosomes may also be obtained by
employing the centromere cloning methods provided herein. In preferred
embodiments, the sequences providing the origin of replication, in
particular, the megareplicator, are derived from rDNA. These sequences
preferably include the rDNA origin of replication and amplification
promoting sequences.

Methods and vectors for targeting heterologous DNA into the
artificial chromosomes are also provided, as are methods and vectors for
fragmenting the chromosomes to produce smaller but stable and
self-replicating artificial chromosomes.

The chromosomes are introduced into cells to produce stable
transformed cell lines or cells, depending upon the source of the cells.
Introduction is effected by any suitable method including, but not limited
to, electroporation, direct uptake, such as by calcium phosphate
precipitation, uptake of isolated chromosomes by lipofection, by microcell
fusion, by lipid-mediated carrier systems or other suitable method. The
resulting cells can be used for production of proteins in the cells. The
chromosomes can be isolated and used for gene delivery. Methods
for isolation of the chromosomes based on the DNA content of the
chromosomes, which differs in MACs versus the authentic
chromosomes, are provided. Also provided are methods that rely on
content, particularly density, and size of the MACs.

These artificial chromosomes can be used in gene therapy, gene
product production systems, production of humanized genetically
transformed animal organs, production of transgenic plants and animals
(non-human), including mammals, birds, fowl, fish, invertebrates,
vertebrates, reptiles and insects, any organism or device that would
employ chromosomal elements as information storage vehicles, and also
for analysis and study of centromere function, for the production of
artificial chromosome vectors that can be constructed in vitro, and for
the preparation of species-specific artificial chromosomes. The artificial
chromosomes can be introduced into cells using microinjection, cell
fusion, microcell fusion, electroporation, nuclear transfer, electrofusion,
projectile bombardment, calcium phosphate
precipitation, lipid-mediated transfer systems and other such methods.
Cells particularly suited for use with the artificial chromosomes include,
but are not limited to, plant cells, particularly tomato, arabidopsis, and
others, insect cells, including silk worm cells, insect larvae, fish, reptiles,
amphibians, arachnids, mammalian cells, avian cells, embryonic stem
cells, haematopoietic stem cells, embryos and cells for use in methods of
genetic therapy, such as lymphocytes that are used in methods of
adoptive immunotherapy and nerve or neural cells. Thus methods of
producing gene products and transgenic (non-human) animals and plants are
provided. Also provided are the resulting transgenic animals and plants.

Exemplary cell lines that contain these chromosomes are also
provided.

Methods for preparing artificial chromosomes for particular species
and for cloning centromeres are also provided. For example, two
exemplary methods provided for generating artificial chromosomes for
use in different species are as follows. First, the methods herein may be
applied to different species. Second, means for generating
species-specific artificial chromosomes and for cloning centromeres are provided.
In particular, a method for cloning a centromere from an animal or plant
is provided by preparing a library of DNA fragments that contain the
genome of the plant or animal and introducing each of the fragments
into a mammalian satellite artificial chromosome [SATAC] that contains a
centromere from a species, generally a mammal, different from the
selected plant or animal, generally a non-mammal, and a selectable
marker. The selected plant or animal is one in which the mammalian
species centromere does not function. Each of the SATACs is
introduced into the cells, which are grown under selective conditions,
and cells with SATACs are identified. Such SATACs should contain a
centromere encoded by the DNA from the library or should contain the
necessary elements for stable replication in the selected species.

Also provided are libraries in which the relatively large fragments
of DNA are contained on artificial chromosomes.

Transgenic (non-human) animals, invertebrates and vertebrates,
plants and insects, fish, reptiles, amphibians, arachnids, birds, fowl, and
mammals are also provided. Of particular interest are transgenic
(non-human) animals and plants that express genes that confer resistance or
reduce susceptibility to disease. For example, the transgene may
encode a protein that is toxic to a pathogen, such as a virus, bacterium
or pest, but that is not toxic to the transgenic host. Furthermore, since
multiple genes can be introduced on a MAC, a series of genes encoding
an antigen can be introduced, which upon expression will serve to
immunize [in a manner similar to a multivalent vaccine] the host animal
against the diseases for which exposure to the antigens provides immunity
or some protection.

Also of interest are transgenic (non-human) animals that serve as
models of certain diseases and disorders for use in studying the disease
and developing therapeutic treatments and cures thereof. Such animal
models of disease express genes [typically carrying a disease-associated
mutation], which are introduced into the animal on a MAC and which
induce the disease or disorder in the animal. Similarly, MACs carrying
genes encoding antisense RNA may be introduced into animal cells to
generate conditional "knock-out" transgenic (non-human) animals. In
such animals, expression of the antisense RNA results in decreased or
complete elimination of the products of genes corresponding to the
antisense RNA. Of further interest are transgenic mammals that harbor
MAC-carried genes encoding therapeutic proteins that are expressed in
the animal's milk. Transgenic (non-human) animals for use in
xenotransplantation, which express MAC-carried genes that serve to
humanize the animal's organs, are also of interest. Genes that might be
used in humanizing animal organs include those encoding human surface
antigens.

Methods for cloning centromeres, such as mammalian
centromeres, are also provided. In particular, in one embodiment, a
library composed of fragments of SATACs is cloned into YACs [yeast
artificial chromosomes] that include a detectable marker, such as DNA
encoding tyrosinase, and then introduced into mammalian cells, such as
albino mouse embryos. Mice produced from embryos containing such
YACs that include a centromere that functions in mammals will express
the detectable marker. Thus, if mice are produced from albino mouse
embryos into which a functional mammalian centromere was introduced,
the mice will be pigmented or have regions of pigmentation.

A method for producing repeated tandem arrays of DNA is
provided. This method, exemplified herein using telomeric DNA, is
applicable to any repeat sequence, and in particular, low complexity
repeats. The method provided herein for synthesis of arrays of tandem
DNA repeats is based on a series of extension steps in which successive
doublings of a sequence of repeats result in an exponential expansion of
the array of tandem repeats. An embodiment of the method of
synthesizing DNA fragments containing tandem repeats may generally be
described as follows. Two oligonucleotides are used as starting
materials. Oligonucleotide 1 is of length k of repeated sequence (the
flanks of which are not relevant) and contains a relatively short stretch
(60-90 nucleotides) of the repeated sequence, flanked with appropriately
chosen restriction sites:

5'-S1>>>>>>>>>>>>>>>>>>>>>>>>>>>S2________-3'

where S1 is restriction site 1 cleaved by E1, S2 is a second restriction
site cleaved by E2, > represents a simple repeat unit, and '________'
denotes a short (8-10) nucleotide flanking sequence complementary to
oligonucleotide 2:

3'-________S3-5'

where S3 is a third restriction site for enzyme E3 and which is present in
the vector to be used during the construction. The method involves the
following steps: (1) oligonucleotides 1 and 2 are annealed; (2) the
annealed oligonucleotides are filled-in to produce a double-stranded (ds)
sequence; (3) the double-stranded DNA is cleaved with restriction
enzymes E1 and E3 and subsequently ligated into a vector (e.g., pUC19
or a yeast vector) that has been cleaved with the same enzymes E1 and
E3; (4) the insert is isolated from a first portion of the plasmid by
digesting with restriction enzymes E1 and E3, and a second portion of
the plasmid is cut with enzymes E2 (treated to remove the 3'-overhang)
and E3, and the large fragment (plasmid DNA plus the insert) is isolated;
(5) the two DNA fragments (the S1-S3 insert fragment and the vector
plus insert) are ligated; and (6) steps 4 and 5 are repeated as many times
as needed to achieve the desired repeat sequence size. In each
extension cycle, the repeat sequence size doubles, i.e., if m is the
number of extension cycles, the size of the repeat sequence will be k x
2^m nucleotides.
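
The doubling relationship described above (repeat size k x 2^m after m extension cycles) can be illustrated with a short calculation. The following is a minimal sketch under stated assumptions, not the laboratory procedure itself. The function names `repeat_length`, `cycles_needed` and `extend_once` are hypothetical, the hexamer TTAGGG is used only as an illustrative telomeric repeat, and the model ignores any nucleotides contributed by restriction-site junctions.

```python
def repeat_length(k: int, m: int) -> int:
    """Repeat-sequence size in nucleotides after m extension cycles.

    k is the length of the starting repeat stretch; each cycle doubles it.
    """
    if k <= 0 or m < 0:
        raise ValueError("k must be positive and m non-negative")
    return k * 2 ** m


def cycles_needed(k: int, target: int) -> int:
    """Smallest number of extension cycles m such that k * 2**m >= target."""
    if k <= 0 or target <= 0:
        raise ValueError("k and target must be positive")
    m = 0
    length = k
    while length < target:
        length *= 2  # one extension cycle (steps 4 and 5) doubles the insert
        m += 1
    return m


def extend_once(insert: str) -> str:
    """Model one cycle: the excised insert is ligated next to the insert
    still carried by the vector, giving two copies in tandem."""
    return insert + insert


if __name__ == "__main__":
    start = "TTAGGG" * 10  # illustrative 60-nucleotide starting stretch
    array = start
    for _ in range(5):
        array = extend_once(array)
    # The string model and the closed-form expression should agree.
    print(len(array), repeat_length(len(start), 5))
    print(cycles_needed(60, 10_000))
```

For a 60-nucleotide starting stretch, five cycles give 60 x 2^5 = 1,920 nucleotides, and eight cycles are the fewest that reach at least 10,000 nucleotides, which shows why only a handful of cloning rounds is needed to build long arrays.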

DESCRIPTION OF THE DRAWINGS 

Figure 1 is a schematic drawing depicting formation of the
MMCneo [the minichromosome] chromosome. A-G represents the
successive events consistent with observed data that would lead to the
formation and stabilization of the minichromosome.

Figure 2 shows a schematic summary of the manner in which the
observed new chromosomes would form, and the relationships among
the different de novo formed chromosomes. In particular, this figure
shows a schematic drawing of the de novo chromosome formation
initiated in the centromeric region of mouse chromosome 7. (A) A single
E-type amplification in the centromeric region of chromosome 7
generates a neo-centromere linked to the integrated "foreign" DNA, and
forms a dicentric chromosome. Multiple E-type amplification forms the λ
neo-chromosome, which separates from the remainder of mouse
chromosome 7 through a specific breakage between the centromeres of
the dicentric chromosome and which was stabilized in a mouse-hamster
hybrid cell line; (B) Specific breakage between the centromeres of a
dicentric chromosome 7 generates a chromosome fragment with the
neo-centromere, and a chromosome 7 with traces of heterologous DNA at the
end; (C) Inverted duplication of the fragment bearing the neo-centromere
results in the formation of a stable neo-minichromosome; (D) Integration
of exogenous DNA into the heterologous DNA region of the formerly
dicentric chromosome 7 initiates H-type amplification, and the formation
of a heterochromatic arm. By capturing a euchromatic terminal segment,
this new chromosome arm is stabilized in the form of the "sausage"
chromosome; (E) BrdU [5-bromodeoxyuridine] treatment and/or drug
selection induce further H-type amplification, which results in the
formation of an unstable gigachromosome; (F) Repeated BrdU treatments
and/or drug selection induce further H-type amplification including a
centromere duplication, which leads to the formation of another
heterochromatic chromosome arm. It is split off from the chromosome 7
by chromosome breakage, and by acquiring a terminal segment, the
stable megachromosome is formed.

Figure 3 is a schematic diagram of the replicon structure and a
scheme by which a megachromosome could be produced. 

Figure 4 sets forth the relationships among some of the exemplary
cell lines described herein.

Figure 5 is a diagram of the plasmid pTEMPUD.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Definitions 

Unless defined otherwise, all technical and scientific terms used
herein have the same meaning as is commonly understood by one of skill
in the art to which this invention belongs. All patents and publications
referred to herein are incorporated by reference.

As used herein, a mammalian artificial chromosome [MAC] is a
piece of DNA that can stably replicate and segregate alongside
endogenous chromosomes. It has the capacity to accommodate and
express heterologous genes inserted therein. It is referred to as a
mammalian artificial chromosome because it includes an active
mammalian centromere(s). Plant artificial chromosomes, insect artificial
chromosomes and avian artificial chromosomes refer to chromosomes
that include plant, insect and avian centromeres, respectively. A human
artificial chromosome [HAC] refers to chromosomes that include human
centromeres, BUGACs refer to insect artificial chromosomes, and AVACs
refer to avian artificial chromosomes. Among the MACs provided herein
are SATACs, minichromosomes, and in vitro synthesized artificial
chromosomes. Methods for construction of each type are provided
herein.

As used herein, in vitro synthesized artificial chromosomes are
artificial chromosomes that are produced by joining the essential
components (at least the centromere, and origins of replication) in vitro.

As used herein, endogenous chromosomes refer to genomic
chromosomes as found in the cell prior to generation or introduction of
a MAC.

As used herein, stable maintenance of chromosomes occurs when
at least about 85%, preferably 90%, more preferably 95%, of the cells
retain the chromosome. Stability is measured in the presence of a
selective agent. Preferably these chromosomes are also maintained in
the absence of a selective agent. Stable chromosomes also retain their
structure during cell culturing, suffering neither intrachromosomal nor
interchromosomal rearrangements.
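
The retention criterion above can be illustrated with a short sketch.
The function name, its parameters and the idea of scoring a counted
sample of cells are assumptions made for illustration only; the
specification states the thresholds [about 85%, 90% and 95%] but does
not prescribe any particular scoring procedure.

```python
def is_stably_maintained(cells_with_chromosome: int,
                         cells_scored: int,
                         min_percent: int = 85) -> bool:
    """Return True when the retained fraction meets the threshold.

    min_percent defaults to 85, the lowest threshold named in the
    definition; 90 or 95 correspond to the preferred thresholds.
    Integer arithmetic is used so the boundary case is exact.
    """
    if cells_scored <= 0:
        raise ValueError("cells_scored must be positive")
    if not 0 <= cells_with_chromosome <= cells_scored:
        raise ValueError("cells_with_chromosome must lie in 0..cells_scored")
    return 100 * cells_with_chromosome >= min_percent * cells_scored
```

For example, a hypothetical count of 92 retaining cells out of 100
scored satisfies the 85% and 90% thresholds but not the 95% threshold.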

As used herein, growth under selective conditions means growth
of a cell under conditions that require expression of a selectable marker
for survival.

As used herein, an agent that destabilizes a chromosome is any
agent known by those of skill in the art to enhance amplification events
and/or mutations. Such agents, which include BrdU, are well known to
those of skill in the art.

As used herein, de novo, with reference to a centromere, refers to
generation of an excess centromere as a result of incorporation of a
heterologous DNA fragment using the methods herein.

As used herein, euchromatin and heterochromatin have their
recognized meanings: euchromatin refers to chromatin that stains
diffusely and that typically contains genes, and heterochromatin refers to
chromatin that remains unusually condensed and that has been thought
to be transcriptionally inactive. Highly repetitive DNA sequences
[satellite DNA], at least with respect to mammalian cells, are usually
located in regions of the heterochromatin surrounding the centromere
[pericentric heterochromatin]. Constitutive heterochromatin refers to
heterochromatin that contains the highly repetitive DNA which is
constitutively condensed and genetically inactive.

As used herein, BrdU refers to 5-bromodeoxyuridine, which during
replication is inserted in place of thymidine. BrdU is used as a mutagen; it
also inhibits condensation of metaphase chromosomes during cell
division.

As used herein, a dicentric chromosome is a chromosome that
contains two centromeres. A multicentric chromosome contains more
than two centromeres.

As used herein, a formerly dicentric chromosome is a chromosome
that is produced when a dicentric chromosome fragments and acquires
new telomeres so that two chromosomes, each having one of the
centromeres, are produced. Each of the fragments is a replicable
chromosome. If one of the chromosomes undergoes amplification of
euchromatic DNA to produce a fully functional chromosome that contains
the newly introduced heterologous DNA and primarily [at least more than
50%] euchromatin, it is a minichromosome. The remaining chromosome
is a formerly dicentric chromosome. If one of the chromosomes
undergoes amplification, whereby heterochromatin [satellite DNA] is
amplified and a euchromatic portion [or arm] remains, it is referred to as
a sausage chromosome. A chromosome that is substantially all
heterochromatin, except for portions of heterologous DNA, is called a
SATAC. Such chromosomes [SATACs] can be produced from sausage
chromosomes by culturing the cell containing the sausage chromosome
under conditions, such as BrdU treatment and/or growth under selective
conditions, that destabilize the chromosome so that a satellite artificial
chromosome [SATAC] is produced. For purposes herein, it is
understood that SATACs may not necessarily be produced in multiple
steps, but may appear after the initial introduction of the heterologous
DNA and growth under selective conditions, or they may appear after
several cycles of growth under selective conditions and BrdU treatment.

As used herein, a SATAC refers to a chromosome that is
substantially all heterochromatin, except for portions of heterologous
DNA. Typically, SATACs are satellite DNA based artificial
chromosomes, but the term encompasses any chromosome made by the
methods herein that contains more heterochromatin than euchromatin.

As used herein, amplifiable, when used in reference to a
chromosome, particularly the method of generating SATACs provided
herein, refers to a region of a chromosome that is prone to amplification.
Amplification typically occurs during replication and other cellular events
involving recombination. Such regions are typically regions of the
chromosome that include tandem repeats, such as satellite DNA, rDNA
and other such sequences.

As used herein, amplification, with reference to DNA, is a process
in which segments of DNA are duplicated to yield two or multiple copies
of identical or nearly identical DNA segments that are typically joined as
substantially tandem or successive repeats or inverted repeats.

As used herein, an amplicon is a repeated DNA amplification unit
that contains a set of inverted repeats of the megareplicon. A
megareplicon represents a higher order replication unit. For example,
with reference to the SATACs, the megareplicon contains a set of
tandem DNA blocks each containing satellite DNA flanked by non-
satellite DNA. Contained within the megareplicon is a primary replication
site, referred to as the megareplicator, which may be involved in
organizing and facilitating replication of the pericentric heterochromatin
and possibly the centromeres. Within the megareplicon there may be
smaller [e.g., 50-300 kb in some mammalian cells] secondary replicons.
In the exemplified SATACs, the megareplicon is defined by two tandem
~7.5 Mb DNA blocks [see, e.g., Fig. 3]. Within each artificial
chromosome [AC] or among a population thereof, each amplicon has the
same gross structure but may contain sequence variations. Such
variations will arise as a result of movement of mobile genetic elements,
deletions or insertions or mutations that arise, particularly in culture.
Such variation does not affect the use of the ACs or their overall
structure as described herein.

As used herein, ribosomal RNA [rRNA] is the specialized RNA that
forms part of the structure of a ribosome and participates in the
synthesis of proteins. Ribosomal RNA is produced by transcription of
genes which, in eukaryotic cells, are present in multiple copies. In
human cells, the approximately 250 copies of rRNA genes per haploid
genome are spread out in clusters on at least five different chromosomes
(chromosomes 13, 14, 15, 21 and 22). In mouse cells, the presence of
ribosomal DNA [rDNA] has been verified on at least 11 pairs out of 20
mouse chromosomes [chromosomes 5, 6, 9, 11, 12, 15, 16, 17, 18, 19
and X] [see, e.g., Rowe et al. (1996) Mamm. Genome 7:886-889 and
Johnson et al. (1993) Mamm. Genome 4:49-52]. In eukaryotic cells, the
highly conserved rRNA genes are located in a
tandemly arranged series of rDNA units, which are generally about 40-45
kb in length and contain a transcribed region and a nontranscribed region
known as spacer (i.e., intergenic spacer) DNA which can vary in length
and sequence. In the human and mouse, these tandem arrays of rDNA
units are located adjacent to the pericentric satellite DNA sequences
(heterochromatin). The regions of these chromosomes in which the
rDNA is located are referred to as nucleolar organizing regions (NOR)
which loop into the nucleolus, the site of ribosome production within the
cell nucleus.

As used herein, the minichromosome refers to a chromosome
derived from a multicentric, typically dicentric, chromosome [see, e.g.,
FIG. 1] that contains more euchromatic than heterochromatic DNA.

As used herein, a megachromosome refers to a chromosome that,
except for introduced heterologous DNA, is substantially composed of
heterochromatin. Megachromosomes are made of an array of repeated
amplicons that contain two inverted megareplicons bordered by
introduced heterologous DNA [see, e.g., Figure 3 for a schematic
drawing of a megachromosome]. For purposes herein, a
megachromosome is about 50 to 400 Mb, generally about 250-400 Mb.
Shorter variants are also referred to as truncated megachromosomes
[about 90 to 120 or 150 Mb], dwarf megachromosomes [~150-200 Mb]
and cell lines, and a micro-megachromosome [~50-90 Mb, typically 50-
60 Mb]. For purposes herein, the term megachromosome refers to the
overall repeated structure based on an array of repeated chromosomal
segments [amplicons] that contain two inverted megareplicons bordered
by any inserted heterologous DNA. The size will be specified.

As used herein, genetic therapy involves the transfer or insertion
of heterologous DNA into certain cells, target cells, to produce specific
gene products that are involved in correcting or modulating disease. The
DNA is introduced into the selected target cells in a manner such that the
heterologous DNA is expressed and a product encoded thereby is
produced. Alternatively, the heterologous DNA may in some manner
mediate expression of DNA that encodes the therapeutic product, or it
may encode a product, such as a peptide or RNA, that in some manner
mediates, directly or indirectly, expression of a therapeutic product.
Genetic therapy may also be used to introduce therapeutic compounds,
such as TNF, that are not normally produced in the host or that are not
produced in therapeutically effective amounts or at a therapeutically
useful time. Expression of the heterologous DNA by the target cells
within an organism afflicted with the disease thereby enables modulation
of the disease. The heterologous DNA encoding the therapeutic product
may be modified prior to introduction into the cells of the afflicted host in
order to enhance or otherwise alter the product or expression thereof.

As used herein, heterologous or foreign DNA and RNA are used
interchangeably and refer to DNA or RNA that does not occur naturally
as part of the genome in which it is present or which is found in a
location or locations in the genome that differ from that in which it
occurs in nature. It is DNA or RNA that is not endogenous to the cell
and has been exogenously introduced into the cell. Examples of
heterologous DNA include, but are not limited to, DNA that encodes a
gene product or gene product(s) of interest, introduced for purposes of
gene therapy or for production of an encoded protein. Other examples
of heterologous DNA include, but are not limited to, DNA that encodes
traceable marker proteins, such as a protein that confers drug resistance,
DNA that encodes therapeutically effective substances, such as
anti-cancer agents, enzymes and hormones, and DNA that encodes other
types of proteins, such as antibodies. Antibodies that are encoded by
heterologous DNA may be secreted or expressed on the surface of the
cell in which the heterologous DNA has been introduced.

As used herein, a therapeutically effective product is a product
that is encoded by heterologous DNA and that, upon introduction of the
DNA into a host, is expressed and effectively ameliorates or
eliminates the symptoms or manifestations of an inherited or acquired
disease, or cures said disease.

As used herein, transgenic plants refer to plants in which 
heterologous or foreign DNA is expressed or in which the expression of a 
gene naturally present in the plant has been altered. 

As used herein, operative linkage of heterologous DNA to
regulatory and effector sequences of nucleotides, such as promoters,
enhancers, transcriptional and translational stop sites, and other signal
sequences refers to the relationship between such DNA and such
sequences of nucleotides. For example, operative linkage of
heterologous DNA to a promoter refers to the physical relationship
between the DNA and the promoter such that the transcription of such
DNA is initiated from the promoter by an RNA polymerase that
specifically recognizes, binds to and transcribes the DNA in reading
frame. Preferred promoters include tissue specific promoters, such as
mammary gland specific promoters, viral promoters, such as TK, CMV and
adenovirus promoters, and other promoters known to those of skill in the
art.

As used herein, isolated, substantially pure DNA refers to DNA
fragments purified according to standard techniques employed by those
skilled in the art, such as that found in Maniatis et al. [(1982) Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY].

As used herein, expression refers to the process by which nucleic
acid is transcribed into mRNA and translated into peptides, polypeptides,
or proteins. If the nucleic acid is derived from genomic DNA, expression
may, if an appropriate eukaryotic host cell or organism is selected,
include splicing of the mRNA.

As used herein, vector or plasmid refers to discrete elements that
are used to introduce heterologous DNA into cells for either expression of
the heterologous DNA or for replication of the cloned heterologous DNA.
Selection and use of such vectors and plasmids are well within the level
of skill of the art.

As used herein, transformation/transfection refers to the process
by which DNA or RNA is introduced into cells. Transfection refers to
the taking up of exogenous nucleic acid, e.g., an expression vector, by a
host cell whether or not any coding sequences are in fact expressed.
Numerous methods of transfection are known to the ordinarily skilled
artisan, for example, by direct uptake using calcium phosphate [CaPO4;
see, e.g., Wigler et al. (1979) Proc. Natl. Acad. Sci. U.S.A. 76:1373-
1376], polyethylene glycol [PEG]-mediated DNA uptake, electroporation,
lipofection [see, e.g., Strauss (1996) Meth. Mol. Biol. 54:307-327],
microcell fusion [see, EXAMPLES; see, also Lambert (1991) Proc. Natl.
Acad. Sci. U.S.A. 88:5907-5911; U.S. Patent No. 5,396,767; Sawford
et al. (1987) Somatic Cell Mol. Genet. 13:279-284; Dhar et al. (1984)
Somatic Cell Mol. Genet. 10:547-559; and McNeill-Killary et al. (1995)
Meth. Enzymol. 254:133-152], lipid-mediated carrier systems [see, e.g.,
Teifel et al. (1995) Biotechniques 19:79-80; Albrecht et al. (1996) Ann.
Hematol. 72:73-79; Holmen et al. (1995) In Vitro Cell Dev. Biol. Anim.
31:347-351; Remy et al. (1994) Bioconjug. Chem. 5:647-654; Le Bolch
et al. (1995) Tetrahedron Lett. 36:6681-6684; Loeffler et al. (1993)
Meth. Enzymol. 217:599-618] or other suitable method. Successful
transfection is generally recognized by detection of the presence of the
heterologous nucleic acid within the transfected cell, such as any
indication of the operation of a vector within the host cell.
Transformation means introducing DNA into an organism so that the
DNA is replicable, either as an extrachromosomal element or by
chromosomal integration.

As used herein, injected refers to the microinjection [use of a small
syringe] of DNA into a cell.

As used herein, substantially homologous DNA refers to DNA that 
includes a sequence of nucleotides that is sufficiently similar to another 
such sequence to form stable hybrids under specified conditions. 

It is well known to those of skill in this art that nucleic acid
fragments with different sequences may, under the same conditions,
hybridize detectably to the same "target" nucleic acid. Two nucleic acid
fragments hybridize detectably, under stringent conditions over a
sufficiently long hybridization period, because one fragment contains a
segment of at least about 14 nucleotides in a sequence which is
complementary [or nearly complementary] to the sequence of at least
one segment in the other nucleic acid fragment. If the time during which
hybridization is allowed to occur is held constant, at a value during
which, under preselected stringency conditions, two nucleic acid
fragments with exactly complementary base-pairing segments hybridize
detectably to each other, departures from exact complementarity can be
introduced into the base-pairing segments, and base-pairing will
nonetheless occur to an extent sufficient to make hybridization
detectable. As the departure from complementarity between the
base-pairing segments of two nucleic acids becomes larger, and as
conditions of the hybridization become more stringent, the probability
decreases that the two segments will hybridize detectably to each other.

Two single-stranded nucleic acid segments have "substantially the
same sequence," within the meaning of the present specification, if
(a) both form a base-paired duplex with the same segment, and (b) the
melting temperatures of said two duplexes in a solution of 0.5 x SSPE
differ by less than 10°C. If the segments being compared have the same
number of bases, then to have "substantially the same sequence", they
will typically differ in their sequences at fewer than 1 base in 10.
Methods for determining melting temperatures of nucleic acid duplexes
are well known [see, e.g., Meinkoth and Wahl (1984) Anal. Biochem.
138:267-284 and references cited therein].
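
The two numerical tests in the preceding definition can be sketched as
follows. This is only an illustrative sketch: the function names are
hypothetical, the melting temperatures are assumed to have been
measured separately [the specification does not give a formula for
them], and the sequence comparison assumes ungapped segments of equal
length.

```python
def melting_temperatures_agree(tm_first_c: float, tm_second_c: float,
                               max_difference_c: float = 10.0) -> bool:
    """Criterion (b): the two duplex melting temperatures, measured in
    0.5 x SSPE, differ by less than 10 degrees C."""
    return abs(tm_first_c - tm_second_c) < max_difference_c


def count_mismatches(first: str, second: str) -> int:
    """Count positions at which two equal-length segments differ."""
    if len(first) != len(second) or not first:
        raise ValueError("segments must be non-empty and of equal length")
    return sum(a != b for a, b in zip(first.upper(), second.upper()))


def differs_at_fewer_than_one_in_ten(first: str, second: str) -> bool:
    """Typical consequence for equal-length segments: fewer than
    1 differing base in 10. Integer arithmetic avoids rounding."""
    return 10 * count_mismatches(first, second) < len(first)
```

Under these assumptions, two 20-base segments that differ at one
position pass the sequence test, whereas two that differ at two
positions [exactly 1 base in 10] do not.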

As used herein, a nucleic acid probe is a DNA or RNA fragment
that includes a sufficient number of nucleotides to specifically hybridize
to DNA or RNA that includes identical or closely related sequences of
nucleotides. A probe may contain any number of nucleotides, from as
few as about 10 and as many as hundreds of thousands of nucleotides.
The conditions and protocols for such hybridization reactions are well
known to those of skill in the art as are the effects of probe size,
temperature, degree of mismatch, salt concentration and other
parameters on the hybridization reaction. For example, the lower the
temperature and higher the salt concentration at which the hybridization
reaction is carried out, the greater the degree of mismatch that may be
present in the hybrid molecules.

To be used as a hybridization probe, the nucleic acid is generally
rendered detectable by labelling it with a detectable moiety or label, such
as 32P and 14C, or by other means, including chemical labelling, such
as by nick-translation in the presence of deoxyuridylate biotinylated at
the 5'-position of the uracil moiety. The resulting probe includes the
biotinylated uridylate in place of thymidylate residues and can be
detected [via the biotin moieties] by any of a number of commercially
available detection systems based on binding of streptavidin to the biotin.
Such commercially available detection systems can be obtained, for
example, from Enzo Biochemicals, Inc. [New York, NY]. Any other label
known to those of skill in the art, including non-radioactive labels, may
be used as long as it renders the probes sufficiently detectable, which is
a function of the sensitivity of the assay, the time available [for culturing
cells, extracting DNA, and hybridization assays], the quantity of DNA or
RNA available as a source of the probe, the particular label and the
means used to detect the label.

Once sequences with a sufficiently high degree of homology to the
probe are identified, they can readily be isolated by standard techniques,
which are described, for example, by Maniatis et al. [(1982) Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY].

As used herein, conditions under which DNA molecules form
stable hybrids and are considered substantially homologous are such that
DNA molecules with at least about 60% complementarity form stable
hybrids. Such DNA fragments are herein considered to be "substantially
homologous". For example, DNA that encodes a particular protein is
substantially homologous to another DNA fragment if the DNA forms
stable hybrids such that the sequences of the fragments are at least
about 60% complementary and if a protein encoded by the DNA retains
its activity.
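
As a rough illustration of the 60% figure, percent complementarity
between two strands can be computed as sketched below. The function
names are hypothetical, and the position-by-position comparison of
equal-length, ungapped strands is a simplifying assumption; whether a
hybrid is in fact stable depends on the hybridization conditions and is
not predicted by this calculation.

```python
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(strand: str, other: str) -> float:
    """Percent of positions at which `strand` pairs with `other`.

    Both strands are written 5'->3', so `other` is read in reverse to
    align it antiparallel to `strand`. Equal lengths are assumed.
    """
    if len(strand) != len(other) or not strand:
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        _COMPLEMENT.get(base) == partner
        for base, partner in zip(strand.upper(), reversed(other.upper()))
    )
    return 100.0 * paired / len(strand)


def meets_homology_threshold(strand: str, other: str,
                             min_percent: float = 60.0) -> bool:
    """Apply the 'at least about 60% complementarity' figure."""
    return percent_complementarity(strand, other) >= min_percent
```

For instance, a strand paired against its exact reverse complement
scores 100%, and a pair matching at 6 of 10 positions sits at the
60% boundary.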

For purposes herein, the following stringency conditions are
defined:

1) high stringency: 0.1 x SSPE, 0.1% SDS, 65°C

2) medium stringency: 0.2 x SSPE, 0.1% SDS, 50°C

3) low stringency: 1.0 x SSPE, 0.1% SDS, 50°C

or any combination of salt and temperature and other reagents that result
in selection of the same degree of mismatch or matching.
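
For convenience, the three defined conditions can be held in a small
lookup table, as in the sketch below. The names `Stringency`,
`STRINGENCY` and `describe` are hypothetical and chosen only for this
illustration; the numerical values are those listed above.

```python
from typing import NamedTuple


class Stringency(NamedTuple):
    sspe_x: float        # SSPE concentration, as a multiple of 1 x
    sds_percent: float   # SDS concentration, percent
    temp_c: int          # temperature, degrees C


# Values copied from the three conditions defined in the text.
STRINGENCY = {
    "high": Stringency(0.1, 0.1, 65),
    "medium": Stringency(0.2, 0.1, 50),
    "low": Stringency(1.0, 0.1, 50),
}


def describe(level: str) -> str:
    """Format one condition in the same form as the list above."""
    cond = STRINGENCY[level]
    return f"{cond.sspe_x:g} x SSPE, {cond.sds_percent:g}% SDS, {cond.temp_c}°C"
```

Note that, per the definition, equivalent combinations of salt,
temperature and other reagents are also encompassed; the table records
only the three named reference points.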

As used herein, immunoprotective refers to the ability of a vaccine
or exposure to an antigen or immunity-inducing agent, to confer upon a
host to whom the vaccine or antigen is administered or introduced, the
ability to resist infection by a disease-causing pathogen or to have
reduced symptoms. The selected antigen is typically an antigen that is
presented by the pathogen.

As used herein, all assays and procedures, such as hybridization
reactions and antibody-antigen reactions, unless otherwise specified, are
conducted under conditions recognized by those of skill in the art as
standard conditions.

A. Preparation of cell lines containing MACs
1. The megareplicon

The methods, cells and MACs provided herein are produced by
virtue of the discovery of the existence of a higher-order replication unit
[megareplicon] of the centromeric region. This megareplicon is delimited
by a primary replication initiation site [megareplicator], and appears to
facilitate replication of the centromeric heterochromatin, and most likely,
centromeres. Integration of heterologous DNA into the megareplicator
region or in close proximity thereto, initiates a large-scale amplification of
megabase-size chromosomal segments, which leads to de novo
chromosome formation in living cells.

DNA sequences that provide a preferred megareplicator are the
rDNA units that give rise to ribosomal RNA (rRNA). In mammals,
particularly mice and humans, these rDNA units contain specialized
elements, such as the origin of replication (or origin of bidirectional
replication, i.e., OBR, in mouse) and amplification promoting sequences
(APS) and amplification control elements (see, e.g., Gogel et al.
(1996) Chromosoma 104:511-518; Coffman et al. (1993) Exp. Cell. Res.
209:123-132; Little et al. (1993); Yoon et al. (1995) Mol. Cell. Biol.
15:2482-2489; Gonzalez and Sylvester (1995) Genomics 27:320-328;
Miesfeld and Arnheim (1982) Nuc. Acids Res.).

As described herein, without being bound by any theory, these
specialized elements may facilitate replication and/or amplification of
megabase-size chromosomal segments in the de novo formation of
chromosomes, such as those described herein, in cells. These
specialized elements are typically located in the nontranscribed intergenic
spacer region upstream of the transcribed region of rDNA. The intergenic
spacer region may itself contain internally repeated sequences which can
be classified as tandemly repeated blocks and nontandem blocks (see,
e.g., Gonzalez and Sylvester (1995) Genomics 27:320-328). In mouse
rDNA, an origin of bidirectional replication may be found within a 3-kb
initiation zone centered approximately 1.6 kb upstream of the
transcription start site (see, e.g., Gogel et al. (1996) Chromosoma
104:511-518). The sequences of these specialized elements tend to
have an altered chromatin structure, which may be detected, for
example, by nuclease hypersensitivity or the presence of AT-rich regions
that can give rise to bent DNA structures. An exemplary sequence
encompassing an origin of replication is shown in SEQ ID NO. 16 and in
GENBANK accession no. X82564 at about positions 2430-5435.
Exemplary sequences encompassing amplification-promoting sequences
include nucleotides 690-1060 and 1105-1530 of SEQ ID NO: 16.
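
Nucleotide ranges such as those recited above are customarily given as
1-based, inclusive positions. The following sketch shows how such a
range could be extracted from a sequence string; the function name and
the constant `APS_RANGES` are hypothetical, and the inclusive reading
of the ranges is an assumption, since the specification does not state
the convention explicitly.

```python
from typing import List, Tuple

# Ranges recited in the text for SEQ ID NO: 16 (1-based, inclusive,
# under the assumption stated in the lead-in).
APS_RANGES: List[Tuple[int, int]] = [(690, 1060), (1105, 1530)]


def subsequence(seq: str, start: int, end: int) -> str:
    """Return positions start..end of seq, 1-based and inclusive."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("range lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return seq[start - 1:end]


def range_length(start: int, end: int) -> int:
    """Number of nucleotides in an inclusive range."""
    return end - start + 1
```

Under this reading, the two recited segments are 371 and 426
nucleotides long, respectively.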

In human rDNA, a primary replication initiation site may be found a
few kilobase pairs upstream of the transcribed region, and secondary
initiation sites may be found throughout the nontranscribed intergenic
spacer region (see, e.g., Yoon et al. (1995) Mol. Cell. Biol. 15:2482-
2489). A complete human rDNA repeat unit is presented in GENBANK
as accession no. U13369 and is set forth in SEQ ID NO. 17 herein.
Another exemplary sequence encompassing a replication initiation site
may be found within the sequence of nucleotides 35355-42486 in
SEQ ID NO. 17, particularly within the sequence of nucleotides 37912-
42486 and more particularly within the sequence of nucleotides 37912-
39288 of SEQ ID NO. 17 (see Coffman et al. (1993) Exp. Cell. Res.
209:123-132).
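
The three ranges recited for SEQ ID NO. 17 narrow progressively, each
lying within the one before it. A brief sketch that checks this
nesting is given below; the names are hypothetical, and the ranges are
again assumed to be 1-based and inclusive.

```python
from typing import List, Tuple

Range = Tuple[int, int]

# Ranges recited in the text, from broadest to narrowest.
HUMAN_INITIATION_RANGES: List[Range] = [
    (35355, 42486),
    (37912, 42486),
    (37912, 39288),
]


def contains(outer: Range, inner: Range) -> bool:
    """True when `inner` lies entirely within `outer` (inclusive)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def is_nested(ranges: List[Range]) -> bool:
    """True when every range lies within the range listed before it."""
    return all(contains(a, b) for a, b in zip(ranges, ranges[1:]))


def lengths(ranges: List[Range]) -> List[int]:
    """Inclusive length of each range, in nucleotides."""
    return [end - start + 1 for start, end in ranges]
```

Under the stated assumption, the three segments span 7132, 4575 and
1377 nucleotides, respectively.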

Cell lines containing MACs can be prepared by transforming cells,
preferably a stable cell line, with a heterologous DNA fragment that
encodes a selectable marker, culturing under selective conditions, and
identifying cells that have a multicentric, typically dicentric, chromosome.
These cells can then be manipulated as described herein to produce the
minichromosomes and other MACs, particularly the heterochromatic
SATACs, as described herein.

Development of a multicentric, particularly dicentric, chromosome
typically is effected through integration of the heterologous DNA in the
pericentric heterochromatin, preferably in the centromeric regions of
chromosomes carrying rDNA sequences. Thus, the frequency of
incorporation can be increased by targeting to these regions, such as by
including DNA, including, but not limited to, rDNA or satellite DNA, in the
heterologous fragment that encodes the selectable marker. Among the
preferred targeting sequences for directing the heterologous DNA to the
pericentromeric heterochromatin are rDNA sequences that target
centromeric regions of chromosomes that carry rRNA genes. Such
sequences include, but are not limited to, the DNA of SEQ ID NO. 16 and
GENBANK accession no. X82564 and portions thereof, the DNA of SEQ
ID NO. 17 and GENBANK accession no. U13369 and portions thereof,
and the DNA of SEQ ID NOS. 18-24. A particular vector incorporating
DNA from within SEQ ID NO. 16 for use in directing integration of
heterologous DNA into chromosomal rDNA is pTERPUD (see Example
12). Satellite DNA sequences can also be used to direct the
heterologous DNA to integrate into the pericentric heterochromatin. For
example, vectors pTEMPUD and pHASPUD, which contain mouse and
human satellite DNA, respectively, are provided herein (see Example 12)
as exemplary vectors for introduction of heterologous DNA into cells for
de novo artificial chromosome formation.

The resulting cell lines can then be treated as the exemplified cells
herein to produce cells in which the dicentric chromosome has
fragmented. The cells can then be used to introduce additional selective
markers into the fragmented dicentric chromosome (i.e., formerly
dicentric chromosome), whereby amplification of the pericentric
heterochromatin will produce the heterochromatic chromosomes.

The following discussion describes this process with reference to
the EC3/7 line and the resulting cells. The same procedures can be
applied to any other cells, particularly cell lines, to create SATACs and
euchromatic minichromosomes.

2. Formation of de novo chromosomes

De novo centromere formation in a transformed mouse
LMTK- fibroblast cell line [EC3/7] after cointegration of λ constructs
[λCM8 and λgtWESneo] carrying human and bacterial DNA [Hadlaczky et
al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110 and U.S.
application Serial No. 08/375,271] has been shown. The integration of
the "heterologous" engineered human, bacterial and phage DNA, and the
subsequent amplification of mouse and heterologous DNA that led to the
formation of a dicentric chromosome, occurred at the centromeric region
of the short arm of a mouse chromosome. By G-banding, this
chromosome was identified as mouse chromosome 7. Because of the
presence of two functionally active centromeres on the same
chromosome, regular breakages occur between the centromeres. Such
specific chromosome breakages gave rise to the appearance [in
approximately 10% of the cells] of a chromosome fragment carrying the
neo-centromere. From the EC3/7 cell line [see, U.S. Patent No.
5,288,625, deposited at the European Collection of Animal Cell Culture
(hereinafter ECACC) under accession no. 90051001; see, also Hadlaczky
et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110, and U.S.
application Serial No. 08/375,271 and the corresponding published
European application EP 0 473 253], two sublines [EC3/7C5 and
EC3/7C6] were selected by repeated single-cell cloning. In these cell
lines, the neo-centromere was found exclusively on a minichromosome
[neo-minichromosome], while the formerly dicentric chromosome carried
traces of "heterologous" DNA.

It has now been discovered that integration of DNA encoding a
selectable marker in the heterochromatic region of the centromere led to
formation of the dicentric chromosome.

3. The neo-minichromosome

The chromosome breakage in the EC3/7 cells, which separates the
neo-centromere from the mouse chromosome, occurred in the G-band
positive "heterologous" DNA region. This is supported by the observation
of traces of λ and human DNA sequences at the broken end of the
formerly dicentric chromosome. Comparing the G-band pattern of the
chromosome fragment carrying the neo-centromere with that of the
stable neo-minichromosome, it is apparent that the neo-minichromosome
is an inverted duplicate of the chromosome fragment that bears the
neo-centromere. This is supported by the observation that although the
neo-minichromosome carries only one functional centromere, both ends of
the minichromosome are heterochromatic, and mouse satellite DNA
sequences were found in these heterochromatic regions by in situ
hybridization.

Mouse cells containing the minichromosome, which contains
multiple repeats of the heterologous DNA, which in the exemplified
embodiment is λ DNA and the neomycin-resistance gene, can be used as
recipient cells in cell transformation. Donor DNA, such as selected
heterologous DNA containing λ DNA linked to a second selectable
marker, such as the gene encoding hygromycin phosphotransferase
which confers hygromycin resistance [hyg], can be introduced into the
mouse cells and integrated into the minichromosomes by homologous
recombination of λ DNA in the donor DNA with that in the
minichromosomes. Integration is verified by in situ hybridization and
Southern blot analyses. Transcription and translation of the
heterologous DNA is confirmed by primer extension and immunoblot
analyses.

For example, DNA has been targeted into the neo-minichromosome
in EC3/7C5 cells using a λ DNA-containing construct [pNem1ruc] that
also contains DNA encoding hygromycin resistance and the Renilla
luciferase gene linked to a promoter, such as the cytomegalovirus [CMV]
early promoter, and the bacterial neomycin resistance-encoding DNA.
Integration of the donor DNA into the chromosome in selected cells
[designated PHN4] was confirmed by nucleic acid amplification [PCR] and
in situ hybridization. Events that would produce a neo-minichromosome
are depicted in Figure 1.

The resulting engineered minichromosome that contains the
heterologous DNA can then be transferred by cell fusion into a recipient
cell line, such as Chinese hamster ovary cells [CHO], and correct
expression of the heterologous DNA can be verified. Following
production of the cells, metaphase chromosomes are obtained, such as
by addition of colchicine, and the chromosomes purified by addition of
AT- and GC-specific dyes on a dual laser beam based cell sorter (see
Example 10 B for a description of methods of isolating artificial
chromosomes). Preparative amounts of chromosomes [5 x 10^4 - 5 x
10^7 chromosomes/ml] at a purity of 95% or higher can be obtained. The
resulting chromosomes are used for delivery to cells by methods such as
microinjection and liposome-mediated transfer.

Thus, the neo-minichromosome is stably maintained in cells,
replicates autonomously, and permits the persistent long-term expression
of the neo gene under non-selective culture conditions. It also contains
megabases of heterologous known DNA [λ DNA in the exemplified
embodiments] that serves as target sites for homologous recombination
and integration of DNA of interest. The neo-minichromosome is, thus, a
vector for genetic engineering of cells. It has been introduced into SCID
mice, and shown to replicate in the same manner as endogenous
chromosomes.

The methods herein provide means to induce the events that lead
to formation of the neo-minichromosome by introducing heterologous
DNA with a selective marker [preferably a dominant selectable marker]
into cells and culturing the cells under selective conditions. As a result,
cells that contain a multicentric, e.g., dicentric chromosome, or
fragments thereof, generated by amplification are produced. Cells with
the dicentric chromosome can then be treated to destabilize the
chromosomes with agents, such as BrdU and/or culturing under selective
conditions, resulting in cells in which the dicentric chromosome has
formed two chromosomes, a so-called minichromosome, and a formerly
dicentric chromosome that has typically undergone amplification in the
heterochromatin where the heterologous DNA has integrated to produce
a SATAC or a sausage chromosome [discussed below]. These cells can
be fused with other cells to separate the minichromosome from the
formerly dicentric chromosome into different cells so that each type of
MAC can be manipulated separately.

4. Preparation of SATACs 

An exemplary protocol for preparation of SATACs is illustrated in
Figure 2 [particularly E and F] and FIGURE 3 [see also the
EXAMPLES, particularly EXAMPLES 4-7].

To prepare a SATAC, the starting materials are cells, preferably a
stable cell line, such as a fibroblast cell line, and a DNA fragment that
includes DNA that encodes a selective marker. The DNA fragment is
introduced into the cell by methods of DNA transfer, including but not
limited to direct uptake using calcium phosphate, electroporation, and
lipid-mediated transfer. To ensure integration of the DNA fragment in the
heterochromatin, it is preferable to start with DNA that will be targeted
to the pericentric heterochromatic region of the chromosome, such as
λCM8 and vectors provided herein, such as pTEMPUD [Figure 5] and
pHASPUD (see Example 12) that include satellite DNA, or specifically
into rDNA in the centromeric regions of chromosomes containing rDNA
sequences. After introduction of the DNA, the cells are grown under
selective conditions. The resulting cells are examined and any that have
multicentric, particularly dicentric, chromosomes [or heterochromatic
chromosomes or sausage chromosomes or other such structure; see
Figures 2D, 2E and 2F] are selected.

In particular, if a cell with a dicentric chromosome is selected, it
can be grown under selective conditions, or, preferably, additional DNA
encoding a second selectable marker is introduced, and the cells grown
under conditions selective for the second marker. The resulting cells
should include chromosomes that have structures similar to those
depicted in Figures 2D, 2E, 2F. Cells with a structure, such as the
sausage chromosome, Figure 2D, can be selected and fused with a
second cell line to eliminate other chromosomes that are not of interest.
If desired, cells with other chromosomes can be selected and treated as
described herein. If a cell with a sausage chromosome is selected, it can
be treated with an agent, such as BrdU, that destabilizes the
chromosome so that the heterochromatic arm forms a chromosome that
is substantially heterochromatic [i.e., a megachromosome, see Figure
2F]. Structures such as the gigachromosome, in which the
heterochromatic arm has amplified but not broken off from the
euchromatic arm, will also be observed. The megachromosome is a
stable chromosome. Further manipulation, such as fusions and growth in
selective conditions and/or BrdU treatment or other such treatment, can
lead to fragmentation of the megachromosome to form smaller
chromosomes that have the amplicon as the basic repeating unit.

The megachromosome can be further fragmented in vivo using a
chromosome fragmentation vector, such as pTEMPUD [see Figure 5 and
EXAMPLE 12], pHASPUD or pTERPUD (see Example 12) to ultimately
produce a chromosome that comprises a smaller stable replicable unit,
about 15 Mb-60 Mb, containing one to four megareplicons.

Thus, the stable chromosomes formed de novo that originate from
the short arm of mouse chromosome 7 have been analyzed. This
chromosome region shows a capacity for amplification of large
chromosome segments, and promotes de novo chromosome formation.
Large-scale amplification at the same chromosome region leads to the
formation of dicentric and multicentric chromosomes, a minichromosome,
the 150-200 Mb size λ neo-chromosome, the "sausage" chromosome,
the 500-1000 Mb gigachromosome, and the stable 250-400 Mb
megachromosome.

A clear segmentation is observed along the arms of the
megachromosome, and analyses show that the building units of this
chromosome are amplicons of ~30 Mb composed of mouse major
satellite DNA with the integrated "foreign" DNA sequences at both ends.
The ~30 Mb amplicons are composed of two ~15 Mb inverted doublets
of ~7.5 Mb mouse major satellite DNA blocks, which are separated from
each other by a narrow band of non-satellite sequences [see, e.g.,
Figure 3]. The wider non-satellite regions at the amplicon borders
contain integrated, exogenous [heterologous] DNA, while the narrow
bands of non-satellite DNA sequences within the amplicons are integral
parts of the pericentric heterochromatin of mouse chromosomes. These
results indicate that the ~7.5 Mb blocks flanked by non-satellite DNA
are the building units of the pericentric heterochromatin of mouse
chromosomes, and the ~15 Mb size pericentric regions of mouse
chromosomes contain two ~7.5 Mb units.
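
The size relationships described above can be checked with simple
arithmetic. The following is a minimal sketch, assuming only the
approximate sizes given in the text (~7.5 Mb blocks, ~15 Mb doublets,
~30 Mb amplicons) and ignoring the narrow non-satellite bands and the
euchromatic terminal segments. The constant and function names are
illustrative and are not part of the disclosure.

```python
# Approximate sizes taken from the description above (all in Mb).
BLOCK_MB = 7.5                    # one mouse major satellite DNA block
DOUBLET_MB = 2 * BLOCK_MB         # inverted doublet of two blocks (~15 Mb)
AMPLICON_MB = 2 * DOUBLET_MB      # amplicon built from two doublets (~30 Mb)


def amplicon_estimate(chromosome_mb: float) -> float:
    """Rough number of ~30 Mb amplicons that fit in a chromosome of this size."""
    if chromosome_mb <= 0:
        raise ValueError("chromosome size must be positive")
    return chromosome_mb / AMPLICON_MB


if __name__ == "__main__":
    # Size range stated in the text for the stable megachromosome.
    for size_mb in (250, 400):
        print(size_mb, "Mb:", round(amplicon_estimate(size_mb), 1), "amplicons")
```

Under these assumptions, a 250-400 Mb megachromosome corresponds to
roughly 8 to 13 amplicons of ~30 Mb each.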

Apart from the euchromatic terminal segments, the whole
megachromosome is heterochromatic, and has structural homogeneity.
Therefore, this large chromosome offers a unique possibility for obtaining
information about the amplification process, and for analyzing some basic
characteristics of the pericentric constitutive heterochromatin, as a
vector for heterologous DNA, and as a target for further fragmentation.

As shown herein, this phenomenon is generalizable and can be
observed with other chromosomes. Also, although these de novo formed
chromosome segments and chromosomes appear different, there are
similarities that indicate that a similar amplification mechanism plays a
role in their formation: (i) in each case, the amplification is initiated in the
centromeric region of the mouse chromosomes and large (Mb size)
amplicons are formed; (ii) mouse major satellite DNA sequences are
constant constituents of the amplicons, either by providing the bulk of



WO 97/40183

PCT/US97/05911

the heterochromatic amplicons [H-type amplification], or by bordering the
euchromatic amplicons [E-type amplification]; (iii) formation of inverted
segments can be demonstrated in the λ neo-chromosome and
megachromosome; (iv) chromosome arms and chromosomes formed by
the amplification are stable and functional.

The presence of inverted chromosome segments seems to be a
common phenomenon in the chromosomes formed de novo at the
centromeric region of mouse chromosome 7. During the formation of the
neo-minichromosome, the event leading to the stabilization of the distal
segment of mouse chromosome 7 that bears the neo-centromere may
have been the formation of its inverted duplicate. Amplicons of the
megachromosome are inverted doublets of ~7.5 Mb mouse major
satellite DNA blocks.

5. Cell lines 

Cell lines that contain MACs, such as the minichromosome, the
λ-neo chromosome, and the SATACs are provided herein or can be
produced by the methods herein. Such cell lines provide a convenient
source of these chromosomes and can be manipulated, such as by cell
fusion or production of microcells for fusion with selected cell lines, to
deliver the chromosome of interest into hybrid cell lines. Exemplary cell
lines are described herein and some have been deposited with the
ECACC.

a. EC3/7C5 and EC3/7C6 
Cell lines EC3/7C5 and EC3/7C6 were produced by single cell 
cloning of EC3/7. For exemplary purposes EC3/7C5 has been deposited
with the ECACC. These cell lines contain a minichromosome and the 
formerly dicentric chromosome from EC3/7. The stable
minichromosomes in cell lines EC3/7C5 and EC3/7C6 appear to be the same
and they seem to be duplicated derivatives of the ~10-15 Mb
"broken-off" fragment of the dicentric chromosome. Their similar size in
the two cell lines may indicate that ~20-30 Mb is the minimal or close
to the minimal physical size for a stable minichromosome.

b. TF1004G19

Introduction of additional heterologous DNA, including DNA
encoding a second selectable marker, hygromycin phosphotransferase,
i.e., the hygromycin-resistance gene, and also a detectable marker,
β-galactosidase (i.e., encoded by the lacZ gene), into the EC3/7C5 cell
line and growth under selective conditions produced cells designated
TF1004G19. In particular, this cell line was produced from the EC3/7C5
cell line by cotransfection with plasmids pH132, which contains an
anti-HIV ribozyme and hygromycin-resistance gene, pCH110 [encodes
β-galactosidase] and λ phage [λcl 875 Sam 7] DNA and selection with
hygromycin B.

Detailed analysis of the TF1004G19 cell line by in situ
hybridization with λ phage and plasmid DNA sequences revealed the
formation of the sausage chromosome. The formerly dicentric
chromosome of the EC3/7C5 cell line translocated to the end of another
acrocentric chromosome. The heterologous DNA integrated into the
pericentric heterochromatin of the formerly dicentric chromosome and is
amplified several times with megabases of mouse pericentric
heterochromatic satellite DNA sequences [Fig. 2D] forming the "sausage"
chromosome. Subsequently the acrocentric mouse chromosome was
substituted by a euchromatic telomere.

In situ hybridization with biotin-labeled subfragments of the
hygromycin-resistance and β-galactosidase genes resulted in a
hybridization signal only in the heterochromatic arm of the sausage
chromosome, indicating that in TF1004G19 transformant cells these
genes are localized in the pericentric heterochromatin.

A high level of gene expression, however, was detected. In
general, heterochromatin has a silencing effect in Drosophila, yeast and
on the HSV-tk gene introduced into satellite DNA at the mouse
centromere. Thus, it was of interest to study the TF1004G19
transformed cell line to confirm that genes located in the heterochromatin
were indeed expressed, contrary to recognized dogma.

For this purpose, subclones of TF1004G19, containing a different
sausage chromosome [see Figure 2D], were established by single cell
cloning. Southern hybridization of DNA isolated from the subclones with
subfragments of hygromycin phosphotransferase and lacZ genes showed
a close correlation between the intensity of hybridization and the length
of the sausage chromosome. This finding supports the conclusion that
these genes are localized in the heterochromatic arm of the sausage
chromosome.

(1) TF1004G-19C5
TF1004G-19C5 is a mouse LMTK- fibroblast cell line containing
neo-minichromosomes and stable "sausage" chromosomes. It is a
subclone of TF1004G19 and was generated by single-cell cloning of the
TF1004G19 cell line. It has been deposited with the ECACC as an
exemplary cell line and exemplary source of a sausage chromosome.
Subsequent fusion of this cell line with CHO K20 cells and selection with
hygromycin and G418 and HAT (hypoxanthine, aminopterin, and
thymidine medium; see Szybalski et al. (1962) Proc. Natl. Acad. Sci.
48:2026) resulted in hybrid cells (designated 19C5xHa4) that carry the
sausage chromosome and the neo-minichromosome. BrdU treatment of
the hybrid cells, followed by single cell cloning and selection with G418
and/or hygromycin produced various cells that carry chromosomes of
interest, including GB43 and G3D5.

(2) other subclones
Cell lines GB43 and G3D5 were obtained by treating 19C5xHa4
cells with BrdU followed by growth in G418-containing selective medium
and retreatment with BrdU. The two cell lines were isolated by single
cell cloning of the selected cells. GB43 cells contain the
neo-minichromosome only. G3D5, which has been deposited with the
ECACC, carries the neo-minichromosome and the megachromosome.
Single cell cloning of this cell line followed by growth of the subclones in
G418- and hygromycin-containing medium yielded subclones such as the
GHB42 cell line carrying the neo-minichromosome and the
megachromosome. H1D3 is a mouse-hamster hybrid cell line carrying
the megachromosome, but no neo-minichromosome, and was generated
by treating 19C5xHa4 cells with BrdU followed by growth in
hygromycin-containing selective medium and single cell subcloning of
selected cells. Fusion of this cell line with the CD4+ HeLa cell line that
also carries DNA encoding an additional selection gene, the
neomycin-resistance gene, produced cells [designated H1xHE41 cells]
that carry the megachromosome as well as a human chromosome that
carries CD4neo. Further BrdU treatment and single cell cloning produced
cell lines, such as 1B3, that include cells with a truncated
megachromosome.
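
The derivation of the cell lines described in this section can be
summarized as a parent map. The following is an illustrative sketch only:
the dictionary encodes one reading of the text above (for instance, that
1B3 was obtained from the H1xHE41 cells), and the names PARENT and
ancestry are hypothetical helpers, not part of the disclosure.

```python
# Parent of each cell line, as read from the description above.
PARENT = {
    "EC3/7C5": "EC3/7",            # single cell cloning
    "EC3/7C6": "EC3/7",            # single cell cloning
    "TF1004G19": "EC3/7C5",        # cotransfection, hygromycin B selection
    "TF1004G-19C5": "TF1004G19",   # single-cell cloning
    "19C5xHa4": "TF1004G-19C5",    # fusion with CHO K20 cells
    "GB43": "19C5xHa4",            # BrdU treatment, G418 selection
    "G3D5": "19C5xHa4",            # BrdU treatment, G418 selection
    "GHB42": "G3D5",               # subcloning in G418 and hygromycin
    "H1D3": "19C5xHa4",            # BrdU treatment, hygromycin selection
    "H1xHE41": "H1D3",             # fusion with CD4+ HeLa cells
    "1B3": "H1xHE41",              # further BrdU treatment and cloning
}


def ancestry(cell_line: str) -> list:
    """Return the cell line followed by each of its ancestors, oldest last."""
    chain = [cell_line]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


if __name__ == "__main__":
    print(" <- ".join(ancestry("1B3")))
```

Walking the map from 1B3 back to EC3/7 makes explicit how many rounds of
cloning, transfection, fusion and BrdU treatment separate a truncated
megachromosome line from the original transformant.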
6. DNA constructs used to transform the cells
Heterologous DNA can be introduced into the cells by transfection
or other suitable method at any stage during preparation of the
chromosomes [see, e.g., FIG. 4]. In general, incorporation of such DNA
into the MACs is assured through site-directed integration, such as may
be accomplished by inclusion of λ-DNA in the heterologous DNA (for the
exemplified chromosomes), and also an additional selective marker gene.
For example, cells containing a MAC, such as the minichromosome or a
SATAC, can be cotransfected with a plasmid carrying the desired
heterologous DNA, such as DNA encoding an HIV ribozyme, the cystic
fibrosis gene, and DNA encoding a second selectable marker, such as
hygromycin resistance. Selective pressure is then applied to the cells by
exposing them to an agent that is harmful to cells that do not express
the new selectable marker. In this manner, cells that include the
heterologous DNA in the MAC are identified. Fusion with a second cell
line can provide a means to produce cell lines that contain one particular
type of chromosomal structure or MAC.

Various vectors for this purpose are provided herein [see
Examples] and others can be readily constructed. The vectors preferably
include DNA that is homologous to DNA contained within a MAC in order
to target the DNA to the MAC for integration therein. The vectors also
include a selectable marker gene and the selected heterologous gene(s)
of interest. Based on the disclosure herein and the knowledge of the
skilled artisan, one of skill can construct such vectors.

Of particular interest herein is the vector pTEMPUD and derivatives
thereof that can target DNA into the heterochromatic region of selected
chromosomes. These vectors can also serve as fragmentation vectors
[see, e.g., Example 12].

Heterologous genes of interest include any gene that encodes a
therapeutic product and DNA encoding gene products of interest. These
genes and DNA include, but are not limited to: the cystic fibrosis gene
[CF], the cystic fibrosis transmembrane regulator (CFTR) gene [see, e.g.,
U.S. Patent No. 5,240,846; Rosenfeld et al. (1992) Cell 68:143-155;
Hyde et al. (1993) Nature 362:250-255; Kerem et al. (1989) Science
245:1073-1080; Riordan et al. (1989) Science 245:1066-1072;
Rommens et al. (1989) Science 245:1059-1065; Osborne et al. (1991)
Am. J. Hum. Genetics 48:6089-6122; White et al. (1990) Nature
344:665-667; Dean et al. (1990) Cell 61:863-870; Erlich et al. (1991)
Science; and U.S. Patent Nos. 5,453,357, 5,449,604, 5,434,086, and
5,240,846, which provides a retroviral vector encoding the normal CFTR
gene].

B. Isolation of artificial chromosomes 

The MACs provided herein can be isolated by any suitable method
known to those of skill in the art. Also, methods are provided herein for
effecting substantial purification, particularly of the SATACs. SATACs
have been isolated by fluorescence-activated cell sorting [FACS]. This
method takes advantage of the nucleotide base content of the SATACs,
which, by virtue of their high heterochromatic DNA content, will differ
from any other chromosomes in a cell. In a particular embodiment,
metaphase chromosomes are isolated and stained with base-specific
dyes, such as Hoechst 33258 and chromomycin A3. Fluorescence-
activated cell sorting will separate the SATACs from the endogenous
chromosomes. A dual-laser cell sorter [FACS Vantage, Becton Dickinson
Immunocytometry Systems], in which two lasers were set to excite the
dyes separately, allowed a bivariate analysis of the chromosomes by
base-pair composition and size. Cells containing such SATACs can be
similarly sorted.
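
The sorting described above relies on a difference in base composition
between SATACs and endogenous chromosomes: Hoechst 33258 binds
preferentially to A/T-rich DNA and chromomycin A3 to G/C-rich DNA. The
following is a minimal sketch of how the A/T fraction of a sequence can
be computed. The two example sequences are invented toy strings, not real
satellite or chromosomal sequences, and the function name is hypothetical.

```python
def at_fraction(sequence: str) -> float:
    """Fraction of A and T bases among the unambiguous bases of a sequence."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    return sum(b in "AT" for b in bases) / len(bases)


if __name__ == "__main__":
    # Toy sequences for illustration only (assumptions, not data).
    at_rich_example = "AATTAATTGC"
    gc_rich_example = "GGCCGGCCAT"
    print("A/T-rich example:", at_fraction(at_rich_example))
    print("G/C-rich example:", at_fraction(gc_rich_example))
```

A chromosome whose DNA is dominated by A/T-rich satellite repeats would be
expected to give a relatively stronger signal with the A/T-specific dye,
which is what allows it to be resolved in a bivariate plot.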

Additional methods provided herein for isolation of artificial
chromosomes from endogenous chromosomes include procedures that
are particularly well suited for large-scale isolation of artificial
chromosomes such as SATACs. In these methods, the size and density
differences between SATACs and endogenous chromosomes are
exploited to effect separation of these two types of chromosomes. Such
methods involve techniques such as swinging bucket centrifugation,
zonal rotor centrifugation, and velocity sedimentation. Affinity-,
particularly immunoaffinity-, based methods for separation of artificial
from endogenous chromosomes are also provided herein. For example,
SATACs, which are predominantly heterochromatin, may be separated
from endogenous chromosomes through immunoaffinity procedures
involving antibodies that specifically recognize heterochromatin, and/or
the proteins associated therewith, when the endogenous chromosomes
contain relatively little heterochromatin, such as in hamster cells.
C. In vitro construction of artificial chromosomes

Artificial chromosomes can be constructed in vitro by assembling
the structural and functional elements that contribute to a complete
chromosome capable of stable replication and segregation alongside
endogenous chromosomes in cells. The identification of the discrete
elements that in combination yield a functional chromosome has made
possible the in vitro generation of artificial chromosomes. The process of
in vitro construction of artificial chromosomes, which can be rigidly
controlled, provides advantages that may be desired in the generation of
chromosomes that, for example, are required in large amounts or that are
intended for specific use in transgenic animal systems.

For example, in vitro construction may be advantageous when
efficiency of time and scale are important considerations in the
preparation of artificial chromosomes. Because in vitro construction
methods do not involve extensive cell culture procedures, they may be
utilized when the time and labor required to transform, feed, cultivate,
and harvest cells used in in vivo cell-based production systems are
unavailable.

In vitro construction may also be rigorously controlled with respect
to the exact manner in which the several elements of the desired artificial
chromosome are combined and in what sequence and proportions they
are assembled to yield a chromosome of precise specifications. These
aspects may be of significance in the production of artificial
chromosomes that will be used in live animals where it is desirable to be
certain that only very pure and specific DNA sequences in specific
amounts are being introduced into the host animal.

The following describes the processes involved in the construction
of artificial chromosomes in vitro, utilizing a megachromosome as
exemplary starting material.

The MACs provided herein, particularly the SATACs, are elegantly
simple chromosomes for use in the identification and isolation of
components to be used in the in vitro construction of artificial
chromosomes. The ability to purify MACs to a very high level of purity,
as described herein, facilitates their use for these purposes. For
example, the megachromosome, particularly truncated forms thereof, i.e.,
cell lines, such as 1B3 and mM2C1, which are derived from H1D3
(deposited at the European Collection of Animal Cell Culture (ECACC)
under Accession No. 96040929, see EXAMPLES below), serve as starting
materials.

For example, the mM2C1 cell line contains a micro-
megachromosome (~50-60 kB), which advantageously contains only one
centromere, two regions of integrated heterologous DNA with adjacent
rDNA sequences, with the remainder of the chromosomal DNA being
mouse major satellite DNA. Other truncated megachromosomes can
serve as a source of telomeres, or telomeres can be provided (see
Examples below regarding construction of plasmids containing tandemly
repeated telomeric sequences). The centromere of the mM2C1 cell line
contains mouse minor satellite DNA, which provides a useful tag for
isolation of the centromeric DNA.

Additional features of particular SATACs provided herein, such as
the micro-megachromosome of the mM2C1 cell line, that make them 
uniquely suited to serve as starting materials in the isolation and 
identification of chromosomal components include the fact that the 
centromeres of each megachromosome within a single specific cell line
are identical. The ability to begin with a homogeneous centromere
source (as opposed to a mixture of different chromosomes having
differing centromeric sequences) greatly facilitates the cloning of the
centromere DNA. By digesting purified megachromosomes, particularly
truncated megachromosomes, such as the micro-megachromosome, with
appropriate restriction endonucleases and cloning the fragments into the
commercially available and well known YAC vectors (see, e.g., Burke et
al. (1987) Science 236:806-812), BAC vectors (see, e.g., Shizuya et al.
(1992) Proc. Natl. Acad. Sci. U.S.A. 89:8794-8797; bacterial artificial
chromosomes which have a capacity of incorporating 0.9 - 1 Mb of DNA)
or PAC vectors (the P1 artificial chromosome vector which is a P1
plasmid derivative that has a capacity of incorporating 300 kb of DNA
and that is delivered to E. coli host cells by electroporation rather than by
bacteriophage packaging; see, e.g., Ioannou et al. (1994) Nature
Genetics 6:84-89; Pierce et al. (1992) Meth. Enzymol. 216:549-574;
Pierce et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:2056-2060; U.S.
Patent No. 5,300,431 and International PCT application No.
WO 92/14819) vectors, it is possible for as few as 50 clones to
represent the entire micro-megachromosome.
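
The relationship between chromosome size, vector capacity and the number
of clones can be illustrated with a short calculation. This is a sketch
under stated assumptions: the 50 Mb chromosome size used below is an
assumed value chosen because it reproduces the figure of 50 clones at an
insert capacity near 1 Mb, the calculation ignores overlap between clones,
and the function name is hypothetical.

```python
import math


def minimum_clones(chromosome_mb: float, insert_capacity_mb: float) -> int:
    """Smallest number of non-overlapping inserts needed to span a chromosome."""
    if chromosome_mb <= 0 or insert_capacity_mb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(chromosome_mb / insert_capacity_mb)


if __name__ == "__main__":
    assumed_chromosome_mb = 50.0  # assumption made for illustration only
    # Insert capacities mentioned in the text: up to ~1 Mb, and 300 kb for PAC.
    for label, capacity_mb in (("~1 Mb inserts", 1.0), ("300 kb inserts", 0.3)):
        print(label, "->", minimum_clones(assumed_chromosome_mb, capacity_mb))
```

In practice a library would need more clones than this lower bound,
because cloned fragments overlap and vary in size.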

a. Centromeres

An exemplary centromere for use in the construction of a 
mammalian artificial chromosome is that contained within the 
megachromosome of any of the megachromosome-containing cell lines 
provided herein, such as, for example, H1D3 and derivatives thereof,
such as mM2C1 cells. Megachromosomes are isolated from such cell
lines utilizing, for example, the procedures described herein, and the
centromeric sequence is extracted from the isolated megachromosomes.
For example, the megachromosomes may be separated into fragments
utilizing selected restriction endonucleases that recognize and cut at sites
that, for instance, are primarily located in the replication and/or
heterologous DNA integration sites and/or in the satellite DNA. Based on
the sizes of the resulting fragments, certain undesired elements may be
separated from the centromere-containing sequences. The centromere-
containing DNA could be as large as 1 Mb.

Probes that specifically recognize the centromeric sequences, such
as mouse minor satellite DNA-based probes [see, e.g., Wong et al.
(1988) Nucl. Acids Res. 16:11645-11661], may be used to isolate the
centromere-containing YAC, BAC or PAC clones derived from the
megachromosome. Alternatively, or in conjunction with the direct
identification of centromere-containing megachromosomal DNA, probes
that specifically recognize the non-centromeric elements, such as probes
specific for mouse major satellite DNA, the heterologous DNA and/or
rDNA, may be used to identify and eliminate the non-centromeric DNA-
containing clones.

Additionally, centromere cloning methods described herein may be
utilized to isolate the centromere-containing sequence of the
megachromosome. For example, Example 12 describes the use of YAC
vectors in combination with the murine tyrosinase gene and NMRI/Han
mice for identification of the centromeric sequence.

Once the centromere fragment has been isolated, it may be
sequenced and the sequence information may in turn be used in PCR
amplification of centromere sequences from megachromosomes or other
sources of centromeres. Isolated centromeres may also be tested for
function in vivo by transferring the DNA into a host mammalian cell.
Functional analysis may include, for example, examining the ability of the
centromere sequence to bind centromere-binding proteins. The cloned
centromere will be transferred to mammalian cells with a selectable
marker gene and the binding of a centromere-specific protein, such as
anti-centromere antibodies (e.g., LU851, see Hadlaczky et al. (1986)
Exp. Cell Res. 167:1-15) can be used to assess function of the
centromeres.

b. Telomeres

Preferred telomeres are the 1 kB synthetic telomere provided
herein (see Examples). A double synthetic telomere construct, which
contains a 1 kB synthetic telomere linked to a dominant selectable
marker gene that continues in an inverted orientation may be used for
ease of manipulation. Such a double construct contains a series of
TTAGGG repeats 3' of the marker gene and a series of repeats of the
inverted sequence, i.e., GGGATT, 5' of the marker gene as follows:
(GGGATT)n--dominant marker gene--(TTAGGG)n. Using an inverted
marker provides an easy means for insertion, such as by blunt end
ligation, since only properly oriented fragments will be selected.
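
The layout of the double construct can be sketched as a string operation.
This is an illustration only: it follows the text in treating the inverted
sequence as the simple reversal of TTAGGG (GGGATT), the marker is
represented by a placeholder string rather than a real gene sequence, and
the function names are hypothetical.

```python
TELOMERE_UNIT = "TTAGGG"


def repeats_for_length(target_bp: int, unit: str = TELOMERE_UNIT) -> int:
    """Number of whole repeat units that fit within a target length in bp."""
    return target_bp // len(unit)


def double_telomere_construct(marker: str, n: int, unit: str = TELOMERE_UNIT) -> str:
    """Build (inverted unit)n + marker + (unit)n, as laid out in the text."""
    if n < 1:
        raise ValueError("n must be at least 1")
    inverted_unit = unit[::-1]  # reversal of TTAGGG gives GGGATT
    return inverted_unit * n + marker + unit * n


if __name__ == "__main__":
    # "-marker-" is a placeholder, not a sequence from the disclosure.
    print(double_telomere_construct("-marker-", 2))
    print(repeats_for_length(1000), "repeats fit in a 1 kB telomere")
```

With six bases per unit, a telomere of about 1 kB holds on the order of
166 tandem repeats on each side of the marker.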

c. Megareplicator

The megareplicator sequences, such as the rDNA, provided herein
are preferred for use in in vitro constructs. The rDNA provides an origin
of replication and also provides sequences that facilitate amplification of
the artificial chromosome in vivo to increase the size of the chromosome
to, for example, accommodate increasing copies of a heterologous gene
of interest as well as continuous high levels of expression of the
heterologous genes.

d. Filler heterochromatin

Filler heterochromatin, particularly satellite DNA, is included to
maintain structural integrity and stability of the artificial chromosome and
provide a structural base for carrying genes within the chromosome. The
satellite DNA is typically A/T-rich DNA sequence, such as mouse major
satellite DNA, or G/C-rich DNA sequence, such as hamster natural
satellite DNA. Sources of such DNA include any eukaryotic organisms
that carry non-coding satellite DNA with sufficient A/T or G/C
composition to promote ready separation by sequence, such as by FACS,
or by density gradients. The satellite DNA may also be synthesized by
generating sequence containing monotone, tandem repeats of highly A/T-
or G/C-rich DNA units.

The most suitable amount of filler heterochromatin for use in
construction of the artificial chromosome may be empirically determined
by, for example, including segments of various lengths, increasing in
size, in the construction process. Fragments that are too small to be
suitable for use will not provide for a functional chromosome, which may
be evaluated in cell-based expression studies, or will result in a
chromosome of limited functional lifetime or mitotic and structural
stability.

e. Selectable marker 

Any convenient selectable marker may be used and at any 
convenient locus in the MAC. 

2. Combination of the isolated chromosomal elements 
Once the isolated elements are obtained, they may be combined
to generate the complete, functional artificial chromosome. This
assembly can be accomplished, for example, by in vitro ligation either in
solution, LMP agarose or on microbeads. The ligation is conducted so
that one end of the centromere is directly joined to a telomere. The other
end of the centromere, which serves as the gene-carrying chromosome
arm, is built up from a combination of satellite DNA and rDNA sequence
and may also contain a selectable marker gene. Another telomere is
joined to the end of the gene-carrying chromosome arm. The gene-carrying
arm is the site at which any heterologous genes of interest, for
example, in expression of desired proteins encoded thereby, are
incorporated either during in vitro construction of the chromosome or
sometime thereafter.

3. Analysis and testing of the artificial chromosome

Artificial chromosomes constructed in vitro may be tested for
functionality in in vivo mammalian cell systems, using any of the
methods described herein for the SATACs or minichromosomes, or known
to those of skill in the art.

4. Introduction of desired heterologous DNA into the in vitro
synthesized chromosome

Heterologous DNA may be introduced into the in vitro synthesized
chromosome using routine methods of molecular biology, may be
introduced using the methods described herein for the SATACs, or may
be incorporated into the in vitro synthesized chromosome as part of one
of the synthetic elements, such as the heterochromatin. The
heterologous DNA may be linked to a selected repeated fragment, and
then the resulting construct may be amplified in vitro using the methods
for such in vitro amplification provided herein (see the Examples).

D. Introduction of artificial chromosomes into cells, tissues, animals
and plants

Suitable hosts for introduction of the MACs provided herein
include, but are not limited to, any animal or plant, cell or tissue thereof,
including, but not limited to: mammals, birds, reptiles, amphibians,
insects, fish, arachnids, tobacco, tomato, wheat, plants and algae. The
MACs, if contained in cells, may be introduced by cell fusion or microcell
fusion or, if the MACs have been isolated from cells, they may be
introduced into host cells by any method known to those of skill in this
art, including but not limited to: direct DNA transfer, electroporation,
lipid-mediated transfer, e.g., lipofection and liposomes, microprojectile
bombardment, microinjection in cells and embryos, protoplast
regeneration for plants, and any other suitable method [see, e.g.,
Weissbach et al. (1988) Methods for Plant Molecular Biology, Academic
Press, N.Y., Section VIII, pp. 421-463; Grierson et al. (1988) Plant
Molecular Biology, 2d Ed., Blackie, London, Ch. 7-9; see, also, U.S.
Patent Nos. 5,491,075; 5,482,928; and 5,424,409; see, also, e.g., U.S.
Patent No. 5,470,708, which describes particle-mediated transformation
of mammalian unattached cells].

Other methods for introducing DNA into cells include nuclear
microinjection and bacterial protoplast fusion with intact cells.
Polycations, such as polybrene and polyornithine, may also be used. For
various techniques for transforming mammalian cells, see, e.g., Keown et
al. (1990) Methods in Enzymology Vol. 185, pp. 527-537; and Mansour
et al. (1988) Nature 336:348-352.

For example, isolated, purified artificial chromosomes can be
injected into an embryonic cell line such as a human kidney primary
embryonic cell line [ATCC accession number CRL 1573] or embryonic
stem cells [see, e.g., Hogan et al. (1994) Manipulating the Mouse
Embryo, A Laboratory Manual, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY, see, especially, pages 255-264 and
Appendix 3].

Preferably the chromosomes are introduced by microinjection,
using a system such as the Eppendorf automated microinjection system,
and grown under selective conditions, such as in the presence of
hygromycin B or neomycin.
1. Methods for introduction of chromosomes into hosts
Depending on the host cell used, transformation is done using
standard techniques appropriate to such cells. These methods include
any method known to those of skill in the art, including those described
herein.

a. DNA uptake

For mammalian cells that do not have cell walls, the calcium
phosphate precipitation method for introduction of exogenous DNA [see,
e.g., Graham et al. (1978) Virology 52:456-457; Wigler et al. (1979)
Proc. Natl. Acad. Sci. U.S.A. 76:1373-1376; and Current Protocols in
Molecular Biology, Vol. 1, Wiley Inter-Science, Supplement 14, Unit
9.1.1-9.1.9 (1990)] is often preferred. DNA uptake can be accomplished
by DNA alone or in the presence of polyethylene glycol [PEG-mediated
gene transfer], which is a fusion agent, or by any variations of such
methods known to those of skill in the art [see, e.g., U.S. Pat. No.
4,684,611].

Lipid-mediated carrier systems are also among the preferred
methods for introduction of DNA into cells [see, e.g., Teifel et al. (1995)
Biotechniques 19:79-80; Albrecht et al. (1996) Ann. Hematol. 72:73-79;
Holmen et al. (1995) In Vitro Cell Dev. Biol. Anim. 31:347-351; Remy et
al. (1994) Bioconjug. Chem. 5:647-654; Le Bolc'h et al. (1995)
Tetrahedron Lett. 36:6681-6684; Loeffler et al. (1993) Meth. Enzymol.
217:599-618]. Lipofection [see, e.g., Strauss (1996) Meth. Mol. Biol.
54:307-327] may also be used to introduce DNA into cells. This method
is particularly well-suited for transfer of exogenous DNA into chicken
cells (e.g., chicken blastodermal cells and primary chicken fibroblasts;
see Brazolot et al. (1991) Mol. Repro. Dev. 30:304-312). In particular,
DNA of interest can be introduced into chickens in operative linkage with
promoters from genes, such as lysozyme and ovalbumin, that are
expressed in the egg, thereby permitting expression of the heterologous
DNA in the egg.

Additional methods useful in the direct transfer of DNA into cells
include particle gun electrofusion [see, e.g., U.S. Patent Nos. 4,955,378,
4,923,814, 4,476,004, 4,906,576 and 4,441,972] and virion-mediated
gene transfer.

A commonly used approach for gene transfer in land plants involves
the direct introduction of purified DNA into protoplasts. The three basic
methods for direct gene transfer into plant cells include: 1) polyethylene
glycol [PEG]-mediated DNA uptake, 2) electroporation-mediated DNA
uptake and 3) microinjection. In addition, plants may be transformed
using ultrasound treatment [see, e.g., International PCT application
publication No. WO 91/00358].

b. Electroporation

Electroporation involves providing high-voltage electrical pulses to
a solution containing a mixture of protoplasts and foreign DNA to create
reversible pores in the membranes of plant protoplasts as well as other
cells. Electroporation is generally used for prokaryotes or other cells,
such as plants, that contain substantial cell-wall barriers. Methods for
effecting electroporation are well known [see, e.g., U.S. Patent Nos.
4,784,737, 5,501,967, 5,501,662, 5,019,034, 5,503,999; see, also,
Fromm et al. (1985) Proc. Natl. Acad. Sci. U.S.A. 82:5824-5828].

For example, electroporation is often used for transformation of
plants [see, e.g., Ag Biotechnology News 7:3 and 17
(September/October 1990)]. In this technique, plant protoplasts are
electroporated in the presence of the DNA of interest that also includes a
phenotypic marker. Electrical impulses of high field strength reversibly
permeabilize biomembranes, allowing the introduction of the plasmids.
Electroporated plant protoplasts reform the cell wall, divide, and form
plant callus. Transformed plant cells will be identified by virtue of the
expressed phenotypic marker. The exogenous DNA may be added to the
protoplasts in any form such as, for example, naked linear, circular or
supercoiled DNA, DNA encapsulated in liposomes, DNA in spheroplasts,
DNA in other plant protoplasts, DNA complexed with salts, and other
methods.

c. Microcells
The chromosomes can be transferred by preparing microcells
containing an artificial chromosome and then fusing with selected target
cells. Methods for such preparation and fusion of microcells are well
known [see the Examples and also see, e.g., U.S. Patent Nos.
5,240,840, 4,806,476, 5,298,429, 5,396,767, Fournier (1981) Proc.
Natl. Acad. Sci. U.S.A. 78:6349-6353; and Lambert et al. (1991) Proc.
Natl. Acad. Sci. U.S.A. 88:5907-5911]. Microcell fusion, using microcells
that contain an artificial chromosome, is a particularly useful method for
introduction of MACs into avian cells, such as DT40 chicken pre-B cells
[for a description of DT40 cell fusion, see, e.g., Dieken et al. (1996)
Nature Genet. 12:174-182].
2. Hosts

Suitable hosts include any host known to be useful for introduction
and expression of heterologous DNA. Of particular interest herein are
animal and plant cells and tissues, including, but not limited to, insect
cells and larvae, plants, and animals, particularly transgenic (non-human)
animals, and animal cells. Other hosts include, but are not limited to,
mammals, birds, particularly fowl such as chickens, reptiles, amphibians,
insects, fish, arachnids, tobacco, tomato, wheat, monocots, dicots and
algae, and any host into which introduction of heterologous DNA is
desired. Such introduction can be effected using the MACs provided
herein, or, if necessary, by using the MACs provided herein to identify
species-specific centromeres and/or functional chromosomal units and
then using the resulting centromeres or chromosomal units as artificial
chromosomes, or alternatively, using the methods exemplified herein for
production of MACs to produce species-specific artificial chromosomes.

a. Introduction of DNA into embryos for production of
transgenic (non-human) animals and introduction of
DNA into animal cells

Transgenic (non-human) animals can be produced by introducing
exogenous genetic material into a pronucleus of a mammalian zygote by
microinjection [see, e.g., U.S. Patent Nos. 4,873,191 and 5,354,674;
see, also, International PCT application publication No. WO 95/14769,
which is based on U.S. application Serial No. 08/159,084]. The zygote
is capable of development into a mammal. The embryo or zygote is
transplanted into a host female uterus and allowed to develop. Detailed
protocols and examples are set forth below.

Transgenic (non-human) animals can also be produced by nuclear
transfer [see Wilmut et al. (1997) Nature 385:810-813, International
PCT application Nos. WO 97/07669 and WO 97/07668]. Briefly, in this
method, the SATAC containing the genes of interest is introduced, by
any suitable method, into an appropriate donor cell, such as a mammary
gland cell, that contains totipotent nuclei. The diploid nucleus of the
cell, which is either in G0 or G1 phase, is then introduced, such as by
cell fusion or microinjection, into an unactivated oocyte, preferably an
enucleated cell, which is arrested in the metaphase of the second
meiotic division. Enucleation may be effected by any suitable method,
such as actual removal, or by treating with means, such as ultraviolet
light, that functionally remove the nucleus. The oocyte is then
activated, preferably after a period of contact, about 6-20 hours for
cattle, of the new nucleus with the cytoplasm, while maintaining correct
ploidy, to produce a reconstituted embryo, which is then introduced into
a host. Ploidy is maintained during activation, for example, by incubating
the reconstituted cell in the presence of a microtubule inhibitor, such as
nocodazole, colchicine, colcemid, and taxol, whereby the DNA replicates
once.

Transgenic chickens can be produced by injection of dispersed
blastodermal cells from Stage X chicken embryos into recipient embryos
at a similar stage of development [see, e.g., Etches et al. (1993) Poultry
Sci. 72:882-889; Petitte et al. (1990) Development 108:185-189].
Heterologous DNA is first introduced into the donor blastodermal cells
using methods such as, for example, lipofection [see, e.g., Brazolot et al.
(1991) Mol. Repro. Dev. 30:304-312] or microcell fusion [see, e.g.,
Dieken et al. (1996) Nature Genet. 12:174-182]. The transfected donor
cells are then injected into recipient chicken embryos [see, e.g., Carsience
et al. (1993) Development 117:669-675]. The recipient chicken
embryos within the shell are candled and allowed to hatch to yield a
germline chimeric chicken.

DNA can be introduced into animal cells using any known
procedure, including, but not limited to: direct uptake, incubation with
polyethylene glycol [PEG], microinjection, electroporation, lipofection, cell
fusion, microcell fusion, particle bombardment, including microprojectile
bombardment [see, e.g., U.S. Patent No. 5,470,708, which provides a
method for transforming unattached mammalian cells via particle
bombardment], and any other such method. For example, the transfer of
plasmid DNA in liposomes directly to human cells in situ has been
approved by the FDA for use in humans [see, e.g., Nabel et al. (1990)
Science 249:1285-1288 and U.S. Patent No. 5,461,032].



b. Introduction of heterologous DNA into plants

Numerous methods for producing or developing transgenic plants
are available to those of skill in the art. The method used is primarily a
function of the species of plant. These methods include, but are not
limited to: direct transfer of DNA by processes, such as PEG-induced
DNA uptake, protoplast fusion, microinjection, electroporation, and
microprojectile bombardment [see, e.g., Uchimiya et al. (1989) J. of
Biotech. 12:1-20 for a review of such procedures; see, also, e.g., U.S.
Patent Nos. 5,436,392 and 5,489,520 and many others]. For purposes
herein, when introducing a MAC, microinjection, protoplast fusion and
particle gun bombardment are preferred.

Plant species, including tobacco, rice, maize, rye, soybean,
Brassica napus, cotton, lettuce, potato and tomato, have been used to
produce transgenic plants. Tobacco and other species, such as petunias,
often serve as experimental models in which the methods have been
developed and the genes first introduced and expressed.

DNA uptake can be accomplished by DNA alone or in the presence
of PEG, which is a fusion agent, with plant protoplasts or by any
variations of such methods known to those of skill in the art [see, e.g.,
U.S. Patent No. 4,684,611 to Schilperoort et al.]. Electroporation, which
involves high-voltage electrical pulses to a solution containing a mixture
of protoplasts and foreign DNA to create reversible pores, has been used,
for example, to successfully introduce foreign genes into rice and
Brassica napus. Microinjection of DNA into plant cells, including cultured
cells and cells in intact plant organs and embryoids in tissue culture, and
microprojectile bombardment [acceleration of small high density particles,
which contain the DNA, to high velocity with a particle gun apparatus,
which forces the particles to penetrate plant cell walls and membranes]
have also been used. All plant cells into which DNA can be introduced
and that can be regenerated from the transformed cells can be used to
produce transformed whole plants which contain the transferred artificial
chromosome. The particular protocol and means for introduction of the
DNA into the plant host may need to be adapted or refined to suit the
particular plant species or cultivar.

c. Insect cells
Insects are useful hosts for introduction of artificial chromosomes
for numerous reasons, including, but not limited to: (a) amplification of
genes encoding useful proteins can be accomplished in the artificial
chromosome to obtain higher protein yields in insect cells; (b) insect cells
support required post-translational modifications, such as glycosylation
and phosphorylation, that can be required for protein biological
functioning; (c) insect cells do not support mammalian viruses, and, thus,
eliminate the problem of cross-contamination of products with such
infectious agents; (d) this technology circumvents traditional recombinant
baculovirus systems for production of nutritional, industrial or medicinal
proteins in insect cell systems; (e) the low temperature optimum for
insect cell growth (28° C) permits reduced energy cost of production; (f)
serum-free growth medium for insect cells permits lower production
costs; (g) artificial chromosome-containing cells can be stored indefinitely
at low temperature; and (h) insect larvae will be biological factories for
production of nutritional, medicinal or industrial proteins by microinjection
of fertilized insect eggs [see, e.g., Joy et al. (1991) Current Science
66:145-150, which provides a method for microinjecting heterologous
DNA into Bombyx mori eggs].

Either MACs or insect-specific artificial chromosomes [BUGACs]
will be used to introduce genes into insects. As described in the
Examples, it appears that MACs will function in insects to direct
expression of heterologous DNA contained thereon. For example, as
described in the Examples, a MAC containing the B. mori actin gene
promoter fused to the lacZ gene has been generated by transfection of
EC3/7C5 cells with a plasmid containing the fusion gene. Subsequent
fusion of the B. mori cells with the transfected EC3/7C5 cells that
survived selection yielded a MAC-containing insect-mouse hybrid cell line
in which β-galactosidase expression was detectable.

Insect host cells include, but are not limited to, hosts such as
Spodoptera frugiperda [caterpillar], Aedes aegypti [mosquito], Aedes
albopictus [mosquito], Drosophila melanogaster [fruitfly], Bombyx mori
[silkworm], Manduca sexta [tomato horn worm] and Trichoplusia ni
[cabbage looper]. Efforts have been directed toward propagation of
insect cells in culture. Such efforts have focused on the fall armyworm,
Spodoptera frugiperda. Cell lines have been developed also from other
insects such as the cabbage looper, Trichoplusia ni, and the silkworm,
Bombyx mori. It has also been suggested that analogous cell lines can
be created using the tomato horn worm, Manduca sexta. To introduce
DNA into an insect, it should be introduced into the larvae, and allowed
to proliferate, and then the hemolymph recovered from the larvae so that
the proteins can be isolated therefrom.

The preferred method herein for introduction of artificial
chromosomes into insect cells is microinjection [see, e.g., Tamura et al.
(1991) Bio Ind. 8:26-31; Nikolaev et al. (1989) Mol. Biol. (Moscow)
23:1177-87; and methods exemplified and discussed herein].
E. Applications for and Uses of Artificial chromosomes
Artificial chromosomes provide convenient and useful vectors, and
in some instances [e.g., in the case of very large heterologous genes] the
only vectors, for introduction of heterologous genes into hosts. Virtually
any gene of interest is amenable to introduction into a host via artificial
chromosomes. Such genes include, but are not limited to, genes that
encode receptors, cytokines, enzymes, proteases, hormones, growth
factors, antibodies, tumor suppressor genes, therapeutic products and
multigene pathways.

The artificial chromosomes provided herein will be used in methods
of protein and gene product production, particularly using insects as host
cells for production of such products, and in cellular (e.g., mammalian
cell) production systems in which the artificial chromosomes
(particularly MACs) provide a reliable, stable and efficient means for
optimizing the biomanufacturing of important compounds for medicine
and industry. They are also intended for use in methods of gene therapy,
and for production of transgenic plants and animals [discussed above,
below and in the EXAMPLES].
1. Gene Therapy

Any nucleic acid encoding a therapeutic gene product or product
of a multigene pathway may be introduced into a host animal, such as a
human, or into a target cell line for introduction into an animal, for
therapeutic purposes. Such therapeutic purposes include genetic
therapy to cure or to provide gene products that are missing or defective,
to deliver agents, such as anti-tumor agents, to targeted cells or to an
animal, and to provide gene products that will confer resistance or
reduce susceptibility to a pathogen or ameliorate symptoms of a disease
or disorder. The following are some exemplary genes and gene products.
Such exemplification is not intended to be limiting.
a. Anti-HIV ribozymes
As exemplified below, DNA encoding anti-HIV ribozymes can be
introduced and expressed in cells using MACs, including the
euchromatin-based minichromosomes and the SATACs. These MACs
can be used to make a transgenic mouse that expresses a ribozyme and,
thus, serves as a model for testing the activity of such ribozymes or from
which ribozyme-producing cell lines can be made. Also, introduction of a
MAC that encodes an anti-HIV ribozyme into human cells will serve as
treatment for HIV infection. Such systems further demonstrate the
viability of using any disease-specific ribozyme to treat or ameliorate a
particular disease.

b. Tumor Suppressor Genes
Tumor suppressor genes are genes that, in their wild-type alleles,
express proteins that suppress abnormal cellular proliferation. When the
gene coding for a tumor suppressor protein is mutated or deleted, the
resulting mutant protein or the complete lack of tumor suppressor protein
expression may result in a failure to correctly regulate cellular
proliferation. Consequently, abnormal cellular proliferation may take
place, particularly if there is already existing damage to the cellular
regulatory mechanism. A number of well-studied human tumors and
tumor cell lines have been shown to have missing or nonfunctional tumor
suppressor genes.

Examples of tumor suppressor genes include, but are not limited
to, the retinoblastoma susceptibility gene or RB gene, the p53 gene, the
gene that is deleted in colon carcinoma [i.e., the DCC gene] and the
neurofibromatosis type 1 [NF-1] tumor suppressor gene [see, e.g., U.S.
Patent No. 5,496,731; Weinberg et al. (1991) Science 254:1138-1146].
Loss of function or inactivation of tumor suppressor genes may play a
central role in the initiation and/or progression of a significant number
of human cancers.
The p53 Gene

Somatic cell mutations of the p53 gene are said to be the most
frequent of the gene mutations associated with human cancer [see, e.g.,
Weinberg et al. (1991) Science 254:1138-1146]. The normal or
wild-type p53 gene is a negative regulator of cell growth, which, when
damaged, favors cell transformation. The p53 expression product is
found in the nucleus, where it may act in parallel or cooperatively with
other gene products. Tumor cell lines in which p53 has been deleted
have been successfully treated with wild-type p53 vector to reduce
tumorigenicity [see Baker et al. (1990) Science 249:912-915].

DNA encoding the p53 gene and plasmids containing this DNA are
well known [see, e.g., U.S. Patent No. 5,260,191; see, also, Chen et al.
(1990) Science 250:1576; Farrel et al. (1991) EMBO J. 10:2879-2887;
plasmids containing the gene are available from the ATCC, and the
sequence is in the GenBank Database, accession nos. X54156, X60020,
M14695, M16494, K03199].

c. The CFTR gene 
Cystic fibrosis [CF] is an autosomal recessive disease that affects
epithelia of the airways, sweat glands, pancreas, and other organs. It is
a lethal genetic disease associated with a defect in chloride ion transport,
and is caused by mutations in the gene coding for the cystic fibrosis
transmembrane conductance regulator [CFTR], a 1480 amino acid protein
that has been associated with the expression of chloride conductance in
a variety of eukaryotic cell types. Defects in CFTR destroy or reduce the
ability of epithelial cells in the airways, sweat glands, pancreas and other
tissues to transport chloride ions in response to cAMP-mediated agonists
and impair activation of apical membrane channels by cAMP-dependent
protein kinase A [PKA]. Given the high incidence and devastating nature
of this disease, development of effective CF treatments is imperative.

The CFTR gene [~250 kb] can be transferred into a MAC for use,
for example, in gene therapy as follows. A CF-YAC [see Green et al.
Science 250:94-98] may be modified to include a selectable marker,
such as a gene encoding a protein that confers resistance to puromycin
or hygromycin, and λ-DNA for use in site-specific integration into a
neo-minichromosome or a SATAC. Such a modified CF-YAC can be
introduced into MAC-containing cells, such as EC3/7C5 or 19C5xHa4
cells, by fusion with yeast protoplasts harboring the modified CF-YAC or
microinjection of yeast nuclei harboring the modified CF-YAC into the
cells. Stable transformants are then selected on the basis of antibiotic
resistance. These transformants will carry the modified CF-YAC within
the MAC contained in the cells.

2. Animals, birds, fish and plants that are genetically altered to
possess desired traits such as resistance to disease

Artificial chromosomes are ideally suited for preparing animals,
including vertebrates and invertebrates, including birds and fish as well
as mammals, that possess certain desired traits, such as, for example,
disease resistance, resistance to harsh environmental conditions, altered
growth patterns, and enhanced physical characteristics.

One example of the use of artificial chromosomes in generating
disease-resistant organisms involves the preparation of multivalent
vaccines. Such vaccines include genes encoding multiple antigens that
can be carried in a MAC, or species-specific artificial chromosome, and
either delivered to a host to induce immunity, or incorporated into
embryos to produce transgenic (non-human) animals and plants that are
immune or less susceptible to certain diseases.

Disease-resistant animals and plants may also be prepared in
which resistance or decreased susceptibility to disease is conferred by
introduction into the host organism or embryo of artificial chromosomes
containing DNA encoding gene products (e.g., ribozymes and proteins
that are toxic to certain pathogens) that destroy or attenuate pathogens
or limit access of pathogens to the host.

Animals and plants possessing desired traits that might, for
example, enhance utility, processibility and commercial value of the
organisms in areas such as the agricultural and ornamental plant
industries may also be generated using artificial chromosomes in the
same manner as described above for production of disease-resistant
animals and plants. In such instances, the artificial chromosomes that
are introduced into the organism or embryo contain DNA encoding gene
products that serve to confer the desired trait in the organism.

Birds, particularly fowl such as chickens, fish and crustaceans will
serve as model hosts for production of genetically altered organisms
using artificial chromosomes.

3. Use of MACs and other artificial chromosomes for
preparation and screening of libraries

Since large fragments of DNA can be incorporated into each
artificial chromosome, the chromosomes are well-suited for use as
cloning vehicles that can accommodate entire genomes in the preparation
of genomic DNA libraries, which then can be readily screened. For
example, MACs may be used to prepare a genomic DNA library useful in
the identification and isolation of functional centromeric DNA from
different species of organisms. In such applications, the MAC used to
prepare a genomic DNA library from a particular organism is one that is
not functional in cells of that organism. That is, the MAC does not
stably replicate, segregate or provide for expression of genes contained
within it in cells of the organism. Preferably, the MACs contain an
indicator gene (e.g., the lacZ gene encoding β-galactosidase or genes
encoding products that confer resistance to antibiotics such as
neomycin, puromycin, hygromycin) linked to a promoter that is capable
of promoting transcription of the indicator gene in cells of the organism.
Fragments of genomic DNA from the organism are incorporated into the
MACs, and the MACs are transferred to cells from the organism. Cells
that contain MACs that have incorporated functional centromeres
contained within the genomic DNA fragments are identified by detection
of expression of the marker gene.

4. Use of MACs and other artificial chromosomes for stable,
high-level protein production

Cells containing the MACs and/or other artificial chromosomes
provided herein are advantageously used for production of proteins,
particularly several proteins from one cell line, such as multiple proteins
involved in a biochemical pathway or multivalent vaccines. The genes
encoding the proteins are introduced into the artificial chromosomes
which are then introduced into cells. Alternatively, the heterologous
gene(s) of interest are transferred into a production cell line that already
contains artificial chromosomes in a manner that targets the gene(s) to
the artificial chromosomes. The cells are cultured under conditions
whereby the heterologous proteins are expressed. Because the proteins
will be expressed at high levels in a stable permanent extra-genomic
chromosomal system, selective conditions are not required.

Any transfectable cells capable of serving as recombinant hosts
adaptable to continuous propagation in a cell culture system [see, e.g.,
McLean (1993) Trends In Biotech. 11:232-238] are suitable for use in an
artificial chromosome-based protein production system. Exemplary host
cell lines include, but are not limited to, the following: Chinese hamster
ovary (CHO) cells [see, e.g., Zang et al. (1995) Biotechnology 13:389-
392], HEK 293, Ltk-, COS-7, DG44, and BHK cells. CHO cells are
particularly preferred host cells. Selection of host cell lines for use in
artificial chromosome-based protein production systems is within the skill
of the art, but often will depend on a variety of factors, including the
properties of the heterologous protein to be produced, potential toxicity
of the protein in the host cell, any requirements for post-translational
modification (e.g., glycosylation, amination, phosphorylation) of the
protein, transcription factors available in the cells, the type of promoter
element(s) being used to drive expression of the heterologous gene,
whether production will be completely intracellular or the heterologous
protein will preferably be secreted from the cell, and the types of
processing enzymes in the cell.

The artificial chromosome-based system for heterologous protein
production has many advantageous features. For example, as described
above, because the heterologous DNA is located in an independent,
extra-genomic artificial chromosome (as opposed to randomly inserted in
an unknown area of the host cell genome or located as
extrachromosomal element(s) providing only transient expression) it is
stably maintained in an active transcription unit and is not subject to
ejection via recombination or elimination during cell division.
Accordingly, it is unnecessary to include a selection gene in the host
cells and thus growth under selective conditions is also unnecessary.
Furthermore, because the artificial chromosomes are capable of
incorporating large segments of DNA, multiple copies of the heterologous
gene and linked promoter element(s) can be retained in the
chromosomes, thereby providing for high-level expression of the foreign
protein(s). Alternatively, multiple copies of the gene can be linked to a
single promoter element and several different genes may be linked in a
fused polygene complex to a single promoter for expression of, for
example, all the key proteins constituting a complete metabolic pathway
[see, e.g., Beck von Bodman et al. (1995) Biotechnology 13:587-591].
Alternatively, multiple copies of a single gene can be operatively linked to
a single promoter, or each copy, or one or several copies, may be linked
to different promoters or to multiple copies of the same promoter.
Additionally, because artificial chromosomes have an almost unlimited
capacity for integration and expression of foreign genes, they can be
used not only for the expression of genes encoding end-products of
interest, but also for the expression of genes associated with optimal
maintenance and metabolic management of the host cell, e.g., genes
encoding growth factors, as well as genes that may facilitate rapid
synthesis of the correct form of the desired heterologous protein product,
e.g., genes encoding processing enzymes and transcription factors.
The MACs are suitable for expression of any proteins or peptides,
including proteins and peptides that require in vivo posttranslational
modification for their biological activity. Such proteins include, but are
not limited to, antibody fragments, full-length antibodies, and multimeric
antibodies; tumor suppressor proteins, naturally occurring or
artificial antibodies and enzymes, heat shock proteins, and others.

Thus, such cell-based "protein factories" employing MACs can be
generated using MACs constructed with multiple copies [theoretically an
unlimited number or at least up to a number such that the resulting MAC
is about up to the size of a genomic chromosome (i.e., endogenous)] of
protein-encoding genes with appropriate promoters, or multiple genes
driven by a single promoter, i.e., a fused gene complex [such as a
complete metabolic pathway in a plant expression system; see, e.g., Beck
von Bodman (1995) Biotechnology 13:587-591]. Once such a MAC is
constructed, it can be transferred to a suitable cell culture system, such
as a CHO cell line in protein-free culture medium [see, e.g., Zang et al.
(1995) Biotechnology 13:389-392] or other immortalized cell lines [see,
e.g., McLean (1993) TIBTECH 11:232-238] where continuous production
can be established.

The ability of MACs to provide for high-level expression of
heterologous proteins in host cells is demonstrated, for example, by
analysis of the H1D3 and G3D5 cell lines deposited
with the ECACC. Northern blot analysis of mRNA obtained from these
cells reveals that expression of the hygromycin-resistance and
β-galactosidase genes in the cells correlates with the amplicon number of
the megachromosome(s) contained therein.

F. Methods for the synthesis of DNA sequences containing repeated
DNA units

Generally, assembly of tandemly repeated DNA poses difficulties,
such as achieving unambiguous annealing of the complementary oligos.
For example, separately annealed products may ligate in an inverted
orientation. Additionally, tandem or inverted repeats are particularly
susceptible to recombination and deletion events that may disrupt the
sequence. Selection of appropriate host organisms (e.g., rec⁻ strains) for
use in the cloning steps of the synthesis of sequences of tandemly
repeated DNA units may aid in reduction and elimination of such events.

Methods are provided herein for the synthesis of extended DNA
sequences containing repeated DNA units. These methods are
particularly applicable to the synthesis of arrays of tandemly repeated
DNA units, which are generally difficult or not possible to construct
utilizing other known gene assembly strategies. A specific use of these
methods is in the synthesis of sequences of any length containing simple
(e.g., ranging from 2-6 nucleotides) tandem repeats (such as telomeres
and satellite DNA repeats and trinucleotide repeats of possible clinical
significance) as well as complex repeated DNA sequences. A particular
example of the synthesis of a telomere sequence containing over 150
successive repeated hexamers utilizing these methods is provided herein.

The methods provided herein for synthesis of arrays of tandem
DNA repeats are based on a series of extension steps in which successive
doublings of a sequence of repeats result in an exponential expansion of
the array of tandem repeats. These methods provide several advantages
over previously known methods of gene assembly. For instance, the
starting oligonucleotides are used only once. The intermediates in, as
well as the final product of, the construction of the DNA arrays described
herein may be obtained in cloned form in a microbial organism (e.g., E.
coli and yeast). Of particular significance
is the fact that sequence length increases exponentially, as opposed to
linearly, in each extension step of the procedure even though only two
oligonucleotides are required in the methods. The construction process
does not depend on the compatibility of restriction enzyme recognition
sequences and the sequence of the repeated DNA because restriction
sites are used only temporarily during the assembly procedure. No
adaptor is necessary, though a region of similar function is located
between two of the restriction sites employed in the process. The only
limitation with respect to restriction site use is that the two sites
employed in the method must not be present elsewhere in the vector
utilized in any cloning steps. These procedures can also be used to
construct complex repeats with perfectly identical repeat units, such as
the variable number tandem repeat (VNTR) 3' of the human
apolipoprotein B100 gene (a repeat unit of 30 bp, 100% AT) or alphoid
satellite DNA.

The method of synthesizing DNA sequences containing tandem repeats
may generally be described as follows:

1. Starting materials

Two oligonucleotides are utilized as starting materials.
Oligonucleotide 1 is of length k of repeated sequence (the flanks of
which are not relevant) and contains a relatively short stretch (60-90
nucleotides) of the repeated sequence, flanked with appropriately chosen
restriction sites:

5'-S1>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>S2_______-3'

wherein S1 is restriction site 1 cleaved by E1 [preferably an enzyme
producing a 3'-overhang (e.g., PacI, PstI, SphI, NsiI, etc.) or blunt-end],
S2 is a second restriction site cleaved by E2 (preferably an enzyme
producing a 3'-overhang or one that cleaves outside the recognition
sequence, such as TspRI), > represents a simple repeat unit, and '_______'
denotes a short (8-10) nucleotide flanking sequence complementary to
oligonucleotide 2:

3'-_______S3-5'

wherein S3 is a third restriction site for enzyme E3 and which is present
in the vector to be used during the construction.

Because there is a large variety of restriction enzymes that
recognize many different DNA sequences as cleavage sites, it should
always be possible to select sites and enzymes (preferably those that
yield a 3'-protruding end) suitable for these methods in connection with
the synthesis of any one particular repeat array. In most cases, only 1
(or perhaps 2) nucleotide(s) of a restriction site is required to be
present in the repeat sequence, and the remaining nucleotides of the
restriction site can be removed, for example:

PacI:   TTAAT/TAA-   (Klenow/dNTP)   TAA-
PstI:   CTGCA/G-     (Klenow/dNTP)   G-
NsiI:   ATGCA/T-     (Klenow/dNTP)   T-
KpnI:   GGTAC/C-     (Klenow/dNTP)   C-

Though there is no known restriction enzyme leaving a single A
behind, this problem can be solved with enzymes leaving behind none at
all, for example:

TaiI:   ACGT/        (Klenow/dNTP)   -
NlaIII: CATG/        (Klenow/dNTP)   -

Additionally, if mung bean nuclease is used instead of Klenow, then the
following is obtained:

XbaI:   T/CTAGA      Mung bean nuclease   A-

Furthermore, there are a number of restriction enzymes that cut outside
of the recognition sequence, and in this case, there is no limitation at
all:

TspRI:  NNCAGTGNN/-  (Klenow/dNTP)   -
BsmI:   GAATGCN/-    (Klenow/dNTP)   -
        CTTAC/GN     (Klenow/dNTP)

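The pattern in the examples above can be sketched in code. The following Python sketch is a simplified model, not part of the described method: it assumes a palindromic recognition site written with a slash at the top-strand cut position, and it assumes that the single-stranded overhang is removed completely (by Klenow/dNTP for 3'-overhangs, or by mung bean nuclease for 5'-overhangs). The function name `residual_nucleotides` is hypothetical. It returns the nucleotides of the site that remain on the downstream fragment after blunting.

```python
def residual_nucleotides(site_with_cut: str) -> str:
    """Return the site nucleotides left on the downstream fragment after
    the single-stranded overhang has been removed.

    site_with_cut: palindromic recognition site with '/' marking the
    top-strand cut position, e.g. 'CTGCA/G' for PstI (assumed notation).
    """
    if site_with_cut.count("/") != 1:
        raise ValueError("expected exactly one '/' marking the cut position")
    left, right = site_with_cut.split("/")
    site = left + right
    top_cut = len(left)
    # For a palindromic site, the bottom-strand cut mirrors the top-strand cut.
    bottom_cut = len(site) - top_cut
    # Removing the overhang leaves only the part of the site that was
    # double-stranded on the downstream fragment.
    return site[max(top_cut, bottom_cut):]


if __name__ == "__main__":
    examples = {
        "PacI": "TTAAT/TAA",
        "PstI": "CTGCA/G",
        "NsiI": "ATGCA/T",
        "KpnI": "GGTAC/C",
        "TaiI": "ACGT/",
        "NlaIII": "CATG/",
        "XbaI": "T/CTAGA",
    }
    for enzyme, site in examples.items():
        print(enzyme, repr(residual_nucleotides(site)))
```

This model does not cover non-palindromic enzymes that cut outside the recognition sequence, such as BsmI, for which the two strands must be considered separately.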
2. Step 1 - Annealing 

Oligonucleotides 1 and 2 are annealed at a temperature selected
depending on the length of overlap (typically in the range of 30-65 °C).

3. Step 2 - Generating a double-stranded molecule

The annealed oligonucleotides are filled-in with Klenow polymerase
in the presence of dNTP to produce a double-stranded (ds) sequence:

5'-S1>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>S2_______S3-3'
3'-S1<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<S2_______S3-5'

4. Step 3 - Incorporation of double-stranded DNA into a vector

The double-stranded DNA is cleaved with restriction enzymes E1
and E3 and subsequently ligated into a vector (e.g., pUC19 or a yeast
vector) that has been cleaved with the same enzymes E1 and E3. The
ligation product is used to transform competent host cells compatible
with the vector being used (e.g., when pUC19 is used, bacterial cells
such as E. coli DH5α are suitable hosts) which are then plated onto
selection plates. Recombinants can be identified either by color (e.g., by
X-gal staining for β-galactosidase expression) or by colony hybridization
using ³²P-labeled oligonucleotide 2 (detection by hybridization to
oligonucleotide 2 is preferred because its sequence is removed in each of
the subsequent extension steps and thus is present only in recombinants
that contain DNA that has undergone successful elongation of the
repeated sequence).

5. Step 4 - Isolation of insert from the plasmid

An aliquot of the recombinant plasmid containing k nucleotides of
the repeat sequence is digested with restriction enzymes E1 and E3, and
the insert is isolated on a gel (native polyacrylamide while the insert is
short, but agarose can be used for isolation of longer inserts in
subsequent steps). A second aliquot of the recombinant plasmid is cut
with enzymes E2 (treated with Klenow and dNTP to remove the 3'-
overhang) and E3, and the large fragment (plasmid DNA plus the insert)
is isolated.

6. Step 5 - Extension of the DNA sequence of k repeats

The two DNAs (the S1-S3 insert fragment and the vector plus
insert) are ligated, plated to selective plates, and screened for extended
recombinants as in Step 3. Now the length of the repeat sequence
between restriction sites is twice that of the repeat sequence in the
previous step, i.e., 2 x k.

7. Step 6 - Extension of the DNA sequence of 2 x k repeats

Steps 4 and 5 are repeated as many times as needed to achieve
the desired repeat sequence size. In each extension cycle, the repeat
sequence size doubles, i.e., if m is the number of extension cycles, the
size of the repeat sequence will be k x 2^m nucleotides.
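
The doubling arithmetic of Steps 5 and 6 can be illustrated with a short calculation. The following Python sketch is illustrative only; the function names are assumptions and are not part of the described method. It computes the repeat length after m extension cycles (k x 2^m) and the smallest number of cycles needed to reach a target length.

```python
def repeat_length(k: int, m: int) -> int:
    """Length in nucleotides of the repeat array after m extension cycles,
    starting from k nucleotides of repeat sequence (k x 2^m)."""
    if k <= 0 or m < 0:
        raise ValueError("k must be positive and m non-negative")
    return k * 2 ** m


def cycles_needed(k: int, target: int) -> int:
    """Smallest number of extension cycles m such that k x 2^m >= target."""
    if k <= 0 or target <= 0:
        raise ValueError("k and target must be positive")
    m = 0
    length = k
    while length < target:
        length *= 2  # each extension cycle doubles the repeat array
        m += 1
    return m


if __name__ == "__main__":
    # Example: a 60-nucleotide starting stretch (10 hexamers) extended to
    # at least 150 hexamers (900 nucleotides).
    k = 60
    m = cycles_needed(k, 150 * 6)
    print(m, repeat_length(k, m))
```

With k = 60, four cycles give 960 nucleotides (160 hexamers), whereas a purely linear scheme adding 60 nucleotides per step would need 14 additions to reach 900 nucleotides.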

The following examples are included for illustrative purposes only 
and are not intended to limit the scope of the invention. 

EXAMPLE 1
General Materials and Methods

The following materials and methods are exemplary of methods
that are used in the following Examples and that can be used to prepare
cell lines containing artificial chromosomes. Other suitable materials and
methods known to those of skill in the art may be used. Modifications of
these materials and methods known to those of skill in the art may also
be employed.

A. Culture of cell lines, cell fusion, and transfection of cells

1. Chinese hamster K-20 cells and mouse A9 fibroblast
cells were cultured in F-12 medium. EC3/7 [see, U.S. Patent No.
5,288,625, and deposited at the European Collection of Animal Cell
Culture (ECACC) under accession no. 90051001; see, also Hadlaczky et
al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110 and U.S.
application Serial No. 08/375,271] and EC3/7C5 [see, U.S. Patent No.
5,288,625 and Praznovszky et al. (1991) Proc. Natl. Acad. Sci. U.S.A.
88:11042-11046] mouse cell lines, and the KE1-2/4 hybrid cell line were
maintained in F-12 medium containing 400 µg/ml G418 [SIGMA, St.
Louis, MO].

2. TF1004G19 and TF1004G-19C5 mouse cells,
described below, and the 19C5xHa4 hybrid, described below, and its
sublines were cultured in F-12 medium containing up to 400 µg/ml
Hygromycin B [Calbiochem]. LP11 cells were maintained in F-12 medium
containing 3-15 µg/ml Puromycin [SIGMA, St. Louis, MO].

3. Cotransfection of EC3/7C5 cells with plasmids
[pH132, pCH110 available from Pharmacia, see, also Hall et al. (1983)
J. Mol. Appl. Gen. 2:101-109] and with λ DNA was conducted using the
calcium phosphate DNA precipitation method [see, e.g., Chen et al.
(1987) Mol. Cell. Biol. 7:2745-2752], using 2-5 µg plasmid DNA and
20 µg λ phage DNA per 5 x 10⁶ recipient cells.

4. Cell fusion

Mouse and hamster cells were fused using polyethylene glycol
[Davidson et al. (1976) Som. Cell Genet. 2:165-176]. Hybrid cells were
selected in HAT medium containing 400 µg/ml Hygromycin B.

Approximately 2x10^ recipient and 2x10⁶ donor cells were fused
using polyethylene glycol [Davidson et al. (1976) Som. Cell Genet.
2:165-176]. Hybrids were selected and maintained in F-12/HAT medium
[Szybalski et al. (1962) Natl. Cancer Inst. Monogr. 7:75-89] containing
10% FCS and 400 µg/ml G418. The presence of "parental"
chromosomes in the hybrid cell lines was verified by in situ hybridization
with species-specific probes using biotin-labeled human and hamster
genomic DNA, and a mouse long interspersed repetitive DNA
[pMCPE1.51].

5. Microcell fusion

Microcell-mediated transfer of artificial chromosomes from
EC3/7C5 cells to recipient cells was done according to Saxon et al.
[(1985) Mol. Cell. Biol. 1:140-146] with the modifications of Goodfellow
et al. [(1989) Techniques for mammalian genome transfer. In Genome
Analysis a Practical Approach. K.E. Davies, ed., IRL Press, Oxford,
Washington DC. pp. 1-17] and Yamada et al. [(1990) Oncogene 5:1141-
1147]. Briefly, 5x10⁶ EC3/7C5 cells in a T25 flask were treated first
with 0.05 µg/ml colcemid for 48 hr and then with 10 µg/ml cytochalasin
B for 30 min. The T25 flasks were centrifuged on edge and the pelleted
microcells were suspended in serum free DME medium. The microcells
were filtered through first a 5 micron and then a 3 micron polycarbonate
filter, treated with 50 µg/ml of phytohemagglutinin, and used for
polyethylene glycol mediated fusion with recipient cells. Selection of
cells containing the MMCneo was started 48 hours after fusion in
medium containing 400-800 µg/ml G418.

Microcells were also prepared from 1B3 and GHB42 donor cells as
follows in order to be fused with E2D6K cells [a CHO K-20 cell line
carrying the puromycin N-acetyltransferase gene, i.e., the puromycin
resistance gene, under the control of the SV40 early promoter]. The
donor cells were seeded to achieve 60-75% confluency within 24-36
hours. After that time, the cells were arrested in mitosis by exposure to
colchicine (10 µg/ml) for 12 or 24 hours to induce micronucleation. To
promote micronucleation of GHB42 cells, the cells were exposed to
hypotonic treatment (10 min at 37°C). After colchicine treatment, or
after colchicine and hypotonic treatment, the cells were grown in
colchicine-free medium.

The donor cells were trypsinized and centrifuged and the pellets
were suspended in a 1:1 Percoll medium and incubated for 30-40 min at
37°C. After the incubation, 1-3 x 10^ cells (60-70% micronucleation
index) were loaded onto each Percoll gradient (each fusion was
distributed on 1-2 gradients). The gradients were centrifuged at 19,000
rpm for 80 min in a Sorvall SS-34 rotor at 34-37°C. After
centrifugation, two visible bands of cells were removed, centrifuged at
2000 rpm, 10 min at 4°C, resuspended and filtered through 8 µm pore
size nucleopore filters.

The microcells prepared from the 1B3 and GHB42 cells were fused
with E2D6K. The E2D6K cells were generated by CaPO4 transfection of
CHO K-20 cells with pCHTV2. Plasmid pCHTV2 contains the puromycin-
resistance gene linked to the SV40 promoter and polyadenylation signal,
the Saccharomyces cerevisiae URA3 gene, 2.4- and 3.2-kb fragments of
a Chinese hamster chromosome 2-specific satellite DNA (HC-2 satellite;
see Fatyol et al. (1994) Nuc. Acids Res. 22:3728-3736), two copies of
the diphtheria toxin-A chain gene (one linked to the herpes simplex virus
thymidine kinase (HSV-TK) gene promoter and SV40 polyadenylation
signal and the other linked to the HSV-TK promoter without a
polyadenylation signal), the ampicillin-resistance gene and the ColE1
origin of replication. Following transfection, puromycin-resistant colonies
were isolated. The presence of the pCHTV2 plasmid in the E2D6K cell
line was confirmed by nucleic acid amplification of DNA isolated from the
cells.

The purified microcells were centrifuged as described above and
resuspended in 2 ml of phytohemagglutinin-P (PHA-P, 100 µg/ml). The
microcell suspension was then added to a 60-70% confluent recipient
culture of E2D6K cells. The preparation was incubated at room
temperature for 30-40 min to agglutinate the microcells. After the PHA-P
was removed, the cells were incubated with 1 ml of 50% polyethylene-
glycol (PEG) for one min. The PEG was removed and the culture was
washed three times with F-12 medium without serum. The cells were
incubated in non-selective medium for 48-60 hours. After this time, the
cell culture was trypsinized and plated in F-12 medium containing 400
µg/ml hygromycin B and 10 µg/ml puromycin to select against the parental
cell lines.

Hybrid clones were isolated from the cells that had been cultured
in selective medium. These clones were then analyzed for expression of
β-galactosidase by the X-gal staining method. Four of five hybrid clones
analyzed that had been generated by fusion of GHB42 microcells with
E2D6K cells yielded positive staining results indicating expression of β-
galactosidase from the lacZ gene contained in the megachromosome
contributed by the GHB42 cells. Similarly, a hybrid clone that had been
generated by fusion of 1B3 microcells with E2D6K cells yielded positive
staining results indicating expression of β-galactosidase from the lacZ
gene contained in the megachromosome contributed by the 1B3 cells. In
situ hybridization analysis of the hybrid clones is also performed to
analyze the mouse chromosome content of the mouse-hamster hybrid
cells.

B. Chromosome banding

Trypsin G-banding of chromosomes was performed using the
method of Wang & Fedoroff [(1972) Nature 235:52-54], and the
detection of constitutive heterochromatin with the BSG
method was done according to Sumner [(1972) Exp. Cell Res. 75:304-
306]. For the detection of chromosome replication by
bromodeoxyuridine [BrdU] incorporation, the Fluorescein Plus Giemsa
[FPG] staining method of Perry & Wolff [(1974) Nature 251:156-158]
was used.

C. Immunolabelling of chromosomes and in situ hybridization

Indirect immunofluorescence labelling with human anti-centromere
serum LU851 [Hadlaczky et al. (1986) Exp. Cell Res. 167:1-15], and
indirect immunofluorescence and in situ hybridization on the same
preparation were performed as described previously [see, Hadlaczky et
al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110, see, also U.S.
application Serial No. 08/375,271]. Immunolabelling with fluorescein-
conjugated anti-BrdU monoclonal antibody [Boehringer] was performed
according to the procedure recommended by the manufacturer, except
that for treatment of mouse A9 chromosomes, 2 M hydrochloric acid
was used at 37°C for 25 min, and for chromosomes of hybrid cells, 1 M
hydrochloric acid was used at 37°C for 30 min.

D. Scanning electron microscopy

Preparation of mitotic chromosomes for scanning electron
microscopy using osmium impregnation was performed as described
previously [Sumner (1991) Chromosoma 100:410-418]. The
chromosomes were observed with a Hitachi S-800 field emission scanning
electron microscope operated with an accelerating voltage of 25 kV.

E. DNA manipulations, plasmids and probes

1. General methods

All general DNA manipulations were performed by standard
procedures [see, e.g., Sambrook et al. (1989) Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY]. The mouse major satellite probe was provided by Dr. J. B.
Rattner [University of Calgary, Alberta, Canada]. Cloned mouse satellite
DNA probes [see Wong et al. (1988) Nucl. Acids Res. 16:11645-11661],
including the mouse major satellite probe, were gifts from Dr. J. B.
Rattner, University of Calgary. Hamster chromosome painting was done
with total hamster genomic DNA, and a cloned repetitive sequence
specific to the centromeric region of chromosome 2 [Fatyol et al. (1994)
Nucl. Acids Res. 22:3728-3736] was also used. Mouse chromosome
painting was done with a cloned long interspersed repetitive sequence
[pMCPE1.51] specific for the mouse euchromatin.

For cotransfection and for in situ hybridization, the pCH110 β-
galactosidase construct [Pharmacia or Invitrogen], and λcI 875 Sam7
phage DNA [New England Biolabs] were used.

2. Construction of Plasmid pPuroTel

Plasmid pPuroTel, which carries a Puromycin-resistance gene and a
cloned 2.5 kb human telomeric sequence [see SEQ ID No. 3], was
constructed from the pBabe-puro retroviral vector [Morgenstern et al.
(1990) Nucl. Acids Res. 18:3587-3596; provided by Dr. L. Szekely
(Microbiology and Tumorbiology Center, Karolinska Institutet,
Stockholm); see, also Tonghua et al. (1995) Chin. Med. J. (Beijing, Engl.
Ed.) 108:653-659; Couto et al. (1994) Infect. Immun. 62:2375-2378;
Dunckley et al. (1992) FEBS Lett. 296:128-34; French et al. (1995) Anal.
Biochem. 228:354-355; Liu et al. (1995) Blood 85:1095-1103;

F. Deposited cell lines

Cell lines KE1-2/4, EC3/7C5, TF1004G19C5, 19C5xHa4, G3D5
and H1D3 have been deposited in accord with the Budapest Treaty at the
European Collection of Animal Cell Culture (ECACC) under Accession
Nos. 96040924, 96040925, 96040926, 96040927, 96040928 and
96040929, respectively. The cell lines were deposited on April 9, 1996,
at the European Collection of Animal Cell Cultures (ECACC) Vaccine
Research and Production Laboratory, Public Health Laboratory Service,
Centre for Applied Microbiology and Research, Porton Down, Salisbury,
Wiltshire SP4 0JG, United Kingdom. The deposits were made in the
name of Gyula Hadlaczky of H. 6723, SZEGED, SZAMOS U.1.A. IX. 36.
HUNGARY, who has authorized reference to the deposited cell lines in
this application and who has provided unreserved and irrevocable
consent to the deposited cell lines being made available to the public in
accordance with Rule 28(1)(d) of the European Patent Convention.

EXAMPLE 2

Preparation of EC3/7, EC3/7C5 and related cell lines

The EC3/7 cell line is an LMTK⁻ mouse cell line that contains the
neo-centromere. The EC3/7C5 cell line is a single-cell subclone of EC3/7
that contains the neo-minichromosome.

A. EC3/7 Cell line

As described in U.S. Patent No. 5,288,625 [see, also Praznovszky
et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:11042-11046 and
Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110] de
novo centromere formation occurs in a transformed mouse LMTK⁻ fibro-
blast cell line [EC3/7] after cointegration of λ constructs [λCM8 and
λgtWESneo] carrying human and bacterial DNA.

By cotransfection of a 14 kb human DNA fragment cloned in λ
[λCM8] and a dominant marker gene [λgtWESneo], a selectable
centromere linked to a dominant marker gene [neo-centromere] was
formed in mouse LMTK⁻ cell line EC3/7 [Hadlaczky et al. (1991) Proc.
Natl. Acad. Sci. U.S.A. 88:8106-8110, see Figure 1]. Integration of the
heterologous DNA [the λ DNA and marker gene-encoding DNA] occurred
into the short arm of an acrocentric chromosome [chromosome 7 (see
Figure 1B)], where an amplification process resulted in the formation of
the new centromere [neo-centromere (see Figure 1C)]. On the dicentric
chromosome (Figure 1C), the newly formed centromere region contains
all the heterologous DNA (human, λ, and bacterial) introduced into the
cell and an active centromere.

Having two functionally active centromeres on the same
chromosome causes regular breakages between the centromeres [see,
Figure 1E]. The distance between the two centromeres on the dicentric
chromosome is estimated to be ~10-15 Mb, and the breakage that
separates the minichromosome occurred between the two centromeres.
Such specific chromosome breakages result in the appearance [in
approximately 10% of the cells] of a chromosome fragment that carries
the neo-centromere [Figure 1F]. This chromosome fragment is principally
composed of human, λ, plasmid, and neomycin-resistance gene DNA, but
it also has some mouse chromosomal DNA. Cytological evidence
suggests that during the stabilization of the MMCneo, there was an
inverted duplication of the chromosome fragment bearing the
neo-centromere. The size of minichromosomes in cell lines containing
the MMCneo is approximately 20-30 Mb; this finding indicates a two-fold
increase in size.

From the EC3/7 cell line, which contains the dicentric chromosome
[Figure 1E], two sublines [EC3/7C5 and EC3/7C6] were selected by
repeated single-cell cloning. In these cell lines, the neo-centromere was
found exclusively on a small chromosome [neo-minichromosome], while
the formerly dicentric chromosome carried detectable amounts of the
exogenously-derived DNA sequences but not an active neo-centromere
[Figures 1F and 1G].

The minichromosomes of cell lines EC3/7C5 and EC3/7C6 are
similar. No differences are detected in their architectures at either the
cytological or molecular level. The minichromosomes were
indistinguishable by conventional restriction endonuclease mapping or by
long-range mapping using pulsed field electrophoresis and Southern
hybridization. The cytoskeleton of cells of the EC3/7C6 line showed an
increased sensitivity to colchicine, so the EC3/7C5 line was used for
further detailed analysis.

B. Preparation of the EC3/7C5 and EC3/7C6 cell lines

The EC3/7C5 cells, which contain the neo-minichromosome, were
produced by subcloning the EC3/7 cell line in high concentrations of
G418 [40-fold the lethal dose] for 350 generations. Two single
cell-derived stable cell lines [EC3/7C5 and EC3/7C6] were established.
These cell lines carry the neo-centromere on minichromosomes and also
contain the remaining fragment of the dicentric chromosome. Indirect
immunofluorescence with anti-centromere antibodies and subsequent in
situ hybridization experiments demonstrated that the minichromosomes
derived from the dicentric chromosome. In EC3/7C5 and EC3/7C6 cell
lines (140 and 128 metaphases, respectively) no intact dicentric
chromosomes were found, and minichromosomes were detected in
97.2% and 98.1% of the cells, respectively. The minichromosomes
have been maintained for over 150 cell generations. These cell lines do
contain the remaining portion of the formerly dicentric chromosome.






Multiple copies of telomeric DNA sequences were detected in the
marker centromeric region of the remaining portion of the formerly
dicentric chromosome by in situ hybridization. This indicates that mouse
telomeric sequences were coamplified with the foreign DNA sequences.
These stable minichromosome-carrying cell lines provide direct evidence
that the extra centromere is functioning and is capable of maintaining the
minichromosomes [see, U.S. Patent No. 5,288,625].

The chromosome breakage in the EC3/7 cells, which separates the
neo-centromere from the mouse chromosome, occurred in the G-band
positive "foreign" DNA region. This is supported by the observation of
traces of λ and human DNA sequences at the broken end of the formerly
dicentric chromosome. Comparing the G-band pattern of the
chromosome fragment carrying the neo-centromere with that of the
stable neo-minichromosome reveals that the neo-minichromosome is an
inverted duplicate of the chromosome fragment that bears the neo-
centromere. This is also evidenced by the observation that although the
neo-minichromosome carries only one functional centromere, both ends
of the minichromosome are heterochromatic, and mouse satellite DNA
sequences were found in these heterochromatic regions by in situ
hybridization.

These two cell lines, EC3/7C5 and EC3/7C6, thus carry a
selectable mammalian minichromosome [MMCneo] with a centromere
linked to a dominant marker gene [Hadlaczky et al. (1991) Proc. Natl.
Acad. Sci. U.S.A. 88:8106-8110]. MMCneo is intended to be used as a
vector for minichromosome-mediated gene transfer and has been used as
a model of a minichromosome-based vector system.

Long range mapping studies of the MMCneo indicated that human
DNA and the neomycin-resistance gene constructs integrated into the
mouse chromosome separately, followed by the amplification of the
chromosome region that contains the exogenous DNA. The MMCneo
contains about 30-50 copies of the λCM8 and λgtWESneo DNA in the
form of approximately 160 kb repeated blocks, which together cover at
least a 3.5 Mb region. In addition to these, there are mouse telomeric
sequences [Praznovszky et al. (1991) Proc. Natl. Acad. Sci. U.S.A.
88:11042-11046] and any DNA of mouse origin necessary for the
correct higher-ordered structural organization of chromatids.
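
As a rough consistency check on the copy number and block size stated above, the following minimal sketch multiplies the two figures. It is an illustration only and not part of the mapping studies; the function name is hypothetical.

```python
def repeat_region_span_mb(copies: int, block_kb: float) -> float:
    """Length in Mb spanned by `copies` tandem blocks of `block_kb` kb each."""
    return copies * block_kb / 1000.0


# 30-50 copies of an approximately 160 kb block (values taken from the text)
low = repeat_region_span_mb(30, 160)
high = repeat_region_span_mb(50, 160)
print(f"estimated span: {low:.1f}-{high:.1f} Mb")

# Both ends of the range are compatible with the stated lower bound of 3.5 Mb
print(low >= 3.5 and high >= 3.5)
```

Under these assumptions the repeated blocks would span several megabases, which agrees with the statement that they cover at least a 3.5 Mb region.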

Using a chromosome painting probe, pMCPE1.51 [mouse long
interspersed repeated DNA], which recognizes exclusively euchromatic
mouse DNA, detectable amounts of interspersed repeat sequences were
found on the MMCneo by in situ hybridization. The neo-centromere is
associated with a small but detectable amount of satellite DNA. The
chromosome breakage that separates the neo-centromere from the
mouse chromosome occurs in the "foreign" DNA region. This is
demonstrated by the presence of λ and human DNA at the broken end of
the formerly dicentric chromosome. At both ends of the MMCneo,
however, there are traces of mouse major satellite DNA as evidenced by
in situ hybridization. This observation suggests that the doubling in size
of the chromosome fragment carrying the neo-centromere during the
stabilization of the MMCneo is a result of an inverted duplication.

Although mouse telomere sequences, which coamplified with the
exogenous DNA sequences during the neo-centromere formation, may
provide sufficient telomeres for the MMCneo, the duplication could have
supplied the functional telomeres for the minichromosome.

The nucleotide sequence of portions of the neo-minichromosomes
was determined as follows. Total DNA was isolated from EC3/7C5 cells
according to standard procedures. The DNA was subjected to nucleic
acid amplification using the Expand Long Template PCR system
[Boehringer Mannheim] according to the manufacturer's procedures. The
amplification procedure required only a single 33-mer oligonucleotide
primer corresponding to sequence in a region of the phage λ right arm,
which is contained in the neo-minichromosome. The sequence of this
oligonucleotide is set forth as the first 33 nucleotides of SEQ ID No. 13.
Because the neo-minichromosome contains a series of inverted repeats of
this sequence, the single oligonucleotide was used as a forward and
reverse primer resulting in amplification of DNA positioned between sets
of inverted repeats of the phage DNA. Three products were obtained
from the single amplification reaction, which suggests that the sequence
of the DNA located between different sets of inverted repeats may differ.
In a repeating nucleic acid unit within an artificial chromosome, minor
differences may be present and may occur during culturing of cells
containing the artificial chromosome. For example, base pair changes
may occur as well as integration of mobile genetic elements and
deletions of repeated sequences.
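
The logic of amplifying with a single primer across inverted repeats can be sketched as follows. This is a toy illustration under stated assumptions, not the procedure used here: the 9-base primer and the short template are invented for the example (the actual primer is a 33-mer from SEQ ID No. 13), and the function names are hypothetical. A region is treated as amplifiable when the primer sequence occurs on the top strand and its reverse complement occurs downstream, so that the same oligonucleotide can prime both strands.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def single_primer_amplicons(template: str, primer: str, max_len: int = 20000):
    """Return (start, end) spans that one primer could amplify.

    A span begins where the primer matches the top strand and ends where the
    reverse complement of the primer matches further downstream, i.e. where
    the primer sequence is present as an inverted repeat.
    """
    n = len(primer)
    rc = revcomp(primer)
    last = len(template) - n + 1
    fwd = [i for i in range(last) if template.startswith(primer, i)]
    rev = [i for i in range(last) if template.startswith(rc, i)]
    return [
        (s, r + n)
        for s in fwd
        for r in rev
        if r >= s + n and (r + n - s) <= max_len
    ]


# Toy template: primer, 10 bases of intervening DNA, then the inverted copy
primer = "GATTACAGG"  # hypothetical primer, not the 33-mer of SEQ ID No. 13
template = "TT" + primer + "ACGTACGTAC" + revcomp(primer) + "GG"
print(single_primer_amplicons(template, primer))
```

If the intervening DNA differs between sets of inverted repeats, each set yields its own product, which is consistent with more than one product arising from a single reaction.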

Each of the three products was subjected to DNA sequence
analysis. The sequences of the three products are set forth in SEQ ID
Nos. 13, 14, and 15, respectively. To be certain that the sequenced
products were amplified from the neo-minichromosome, control
amplifications were conducted using the same primers on DNA isolated
from negative control cell lines (mouse Ltk- cells) lacking
minichromosomes and the formerly dicentric chromosome, and positive
control cell lines [the mouse-hamster hybrid cell line GB43 generated by
treating 19C5xHa4 cells (see Figure 4) with BrdU followed by growth in
G418-containing selective medium and retreatment with BrdU] containing
the neo-minichromosome only. Only the positive control cell line yielded
the three amplification products; no amplification product was detected
in the negative control reaction. The results obtained in the positive
control amplification also demonstrate that the neo-minichromosome
DNA, and not the fragment of the formerly dicentric mouse chromosome,
was amplified.

The sequences of the three amplification products were compared
to those contained in the Genbank/EMBL database. SEQ ID Nos. 13 and
14 showed high (~96%) homology to portions of DNA from
intracisternal A-particles from mouse. SEQ ID No. 15 showed no
significant homology with sequences available in the database. All three
of these sequences may be used for generating gene targeting vectors as
homologous DNAs to the neo-minichromosome.
C. Isolation and partial purification of minichromosomes

Mitotic chromosomes of [...]
by Hadlaczky et al. [(1981) Chromosoma 81:537-555], using a
glycine-hexylene glycol buffer system [Hadlaczky et al. (1982)
Chromosoma 86:643-659]. Chromosome suspensions were centrifuged
at 1,200 x g for 30 minutes. The supernatant containing
minichromosomes was centrifuged at 5,000 x g for 30 minutes and the
pellet was resuspended in the appropriate buffer. Partially purified
minichromosomes were stored in 50% glycerol at -20° C.

D. Stability of the MMCneo maintenance and neo expression
EC3/7C5 cells grown in non-selective medium for 284 days and
then transferred to selective medium containing 400 µg/ml G418 showed
a 96% plating efficiency (colony formation) compared to control cells
cultured permanently in the presence of G418. Cytogenetic analysis
indicated that the MMCneo is stably maintained at one copy per cell
under selective and non-selective culture conditions. Only two
metaphases with two MMCneo were found in 2,270 metaphases
analyzed.
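
To put the metaphase counts above in proportion, the following minimal sketch computes the fraction of analyzed metaphases that carried two copies of the MMCneo. The variable names are illustrative only; the counts are those given in the text.

```python
# Counts taken from the cytogenetic analysis described above
two_copy_metaphases = 2
total_metaphases = 2270

# Fraction of metaphases in which two MMCneo were seen
two_copy_fraction = two_copy_metaphases / total_metaphases
print(f"{two_copy_fraction:.4%} of metaphases carried two MMCneo")
```

That is fewer than one metaphase in a thousand, consistent with maintenance at one copy per cell.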

Southern hybridization [...]
DNA restriction patterns, and similar hybridization intensities were
observed with a neo probe when DNA from cells grown under selective
or non-selective culture conditions was compared.

Northern analysis of RNA transcripts from the neo gene isolated
from cells grown under selective and non-selective conditions showed
only minor and not significant differences. Expression of the neo gene
persisted in EC3/7C5 cells maintained in F-12 medium free of G418 for
290 days under non-selective culture conditions. The long-term
expression of the neo gene(s) from the minichromosome may be
influenced by the nuclear location of the MMCneo. In situ hybridization
experiments revealed a preferential peripheral location of the MMCneo in
the interphase nucleus. In more than 60% of the 2,500 nuclei analyzed,
the minichromosome was observed at the perimeter of the nucleus near
the nuclear envelope.

EXAMPLE 3 

Minichromosome transfer and production of the λneo-chromosome

A. Minichromosome transfer
The neo-minichromosome [referred to as MMCneo, FIG. 2C] has
been used for gene transfer by fusion of minichromosome-containing
cells [EC3/7C5 or EC3/7C6] with different mammalian cells, including
hamster and human. Thirty-seven stable hybrid cell lines have been
produced. All established hybrid cell lines proved to be true hybrids as
evidenced by in situ hybridization using biotinylated human and hamster
genomic, or pMCPE1.51 mouse long interspersed repeated DNA probes
for "chromosome painting". The MMCneo has also been successfully
transferred into mouse A9, L929 and pluripotent F9 teratocarcinoma cells
by fusion of microcells derived from EC3/7C5 cells. Transfer was
confirmed by PCR, Southern blotting and in situ hybridization with
minichromosome-specific probes. The cytogenetic analysis confirmed
that, as expected for microcell fusion, a few cells [1-5%] received [or
retained] the MMCneo.

These results demonstrate that the MMCneo is tolerated by a wide
range of cells. The prokaryotic genes and the extra dosage for the
human and λ sequences carried on the minichromosome seem to be not
disadvantageous for tissue culture cells.

The MMCneo is the smallest chromosome of the EC3/7C5 genome
and is estimated to be approximately 20-30 Mb, which is significantly
smaller than the majority of the host cell (mouse) chromosomes. By
virtue of the smaller size, minichromosomes can be partially purified from
a suspension of isolated chromosomes by a simple differential
centrifugation. In this way, minichromosome suspensions of 15-20%
purity have been prepared. These enriched minichromosome
preparations can be used to introduce, such as by microinjection or
lipofection, the minichromosome into selected target cells. Target cells
include therapeutic cells that can be used in methods of gene therapy, and
also embryonic cells for the preparation of transgenic (non-human)
animals.

The MMCneo is capable of autonomous replication, is stably
maintained in cells, and permits persistent expression of the neo gene(s),
even after long-term culturing under non-selective conditions. It is a
non-integrative vector that appears to occupy a territory near the nuclear
envelope. Its peripheral localization in the nucleus may have an
important role in maintaining the functional integrity and stability of the
MMCneo. Functional compartmentalization of the host nucleus may have
an effect on the function of foreign sequences. In addition, MMCneo
contains megabases of λ DNA sequences that should serve as a target
site for homologous recombination and thus integration of desired
gene(s) into the MMCneo. It can be transferred by cell and microcell
fusion, microinjection, electroporation, lipid-mediated carrier systems or
chromosome uptake. The neo-centromere of the MMCneo is capable of
maintaining and supporting the normal segregation of a larger 150-200
Mb λneo-chromosome. This result demonstrates that the MMCneo
chromosome should be useful for carrying large fragments of
heterologous DNA.

B. Production of the λneo-chromosome
In the hybrid cell line KE1-2/4 made by fusion of EC3/7 and
Chinese hamster ovary cells [FIG. 2], the separation of the
neo-centromere from the dicentric chromosome was associated with a further
amplification process. This amplification resulted in the formation of a
stable chromosome of average size [i.e., the λneo-chromosome; see,
Praznovszky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:11042-
11046]. The λneo-chromosome carries a terminally located functional
centromere and is composed of seven large amplicons containing multiple
copies of λ, human, bacterial, and mouse DNA sequences [see FIG. 2].
The amplicons are separated by mouse major satellite DNA [Praznovszky
et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:11042-11046], which
forms narrow bands of constitutive heterochromatin between the
amplicons.

EXAMPLE 4 
Formation of the "sausage chromosome" [SC] 

The findings set forth in the above EXAMPLES demonstrate that
the centromeric region of the mouse chromosome 7 has the capacity for
large-scale amplification [other results indicate that this capacity is not
unique to chromosome 7]. This conclusion is further supported by
results from cotransfection experiments, in which a second dominant
selectable marker gene and a non-selected marker gene were introduced
into EC3/7C5 cells carrying the formerly dicentric chromosome 7 and the
[...] the EC3/7C5 cell line was transformed with
λ phage DNA, a hygromycin-resistance gene construct [pH132], and a
β-galactosidase gene construct [pCH110]. Stable transformants were
selected in the presence of high concentrations [400 µg/ml] of Hygromycin
B, and analyzed by Southern hybridization. Established transformant cell
lines showing multiple copies of integrated exogenous DNA were studied
by in situ hybridization to localize the integration site(s), and by LacZ
staining to detect β-galactosidase expression.
A. Materials and methods
1. Construction of pH132

The pH132 plasmid carries the hygromycin B resistance gene and
the anti-HIV-1 gag ribozyme [see, SEQ ID NO. 6 for DNA sequence that
corresponds to the sequence of the ribozyme] under control of the
β-actin promoter. This plasmid was constructed from pHyg plasmid
[Sugden et al. (1985) Mol. Cell. Biol. 5:410-413; a gift from Dr. A. D.
Riggs, Beckman Research Institute, Duarte; see, also, e.g., U.S. Patent
No. 4,997,764], and from pPC-RAG12 plasmid [see, Chang et al. (1990)
Clin. Biotech. 2:23-31; provided by Dr. J. J. Rossi, Beckman Research
Institute, Duarte; see, also U.S. Patent Nos. 5,272,262, 5,149,796 and
5,144,019, which describe the anti-HIV gag ribozyme and construction
of a mammalian expression vector containing the ribozyme insert linked
to the β-actin promoter and SV40 late gene transcriptional termination
and polyA signals]. Construction of pPC-RAG12 involved insertion of the
ribozyme insert flanked by BamHI linkers into BamHI-digested
pHβ-Apr-1gpt [see, Gunning et al. (1987) Proc. Natl. Acad. Sci. U.S.A.
84:4831-4835; see, also U.S. Patent No. 5,144,019].

Plasmid pH132 was constructed as follows. First, pPC-RAG12
[described by Chang et al. (1990) Clin. Biotech. 2:23-31] was digested
with BamHI to excise a fragment containing an anti-HIV ribozyme gene
[referred to as ribozyme D by Chang et al. [(1990) Clin. Biotech. 2:23-
31]; see also U.S. Patent No. 5,144,019 to Rossi et al., particularly
Figure 4 of the patent] flanked by the human β-actin promoter at the 5'
end of the gene and the SV40 late transcriptional termination and
polyadenylation signals at the 3' end of the gene. As described by
Chang et al. [(1990) Clin. Biotech. 2:23-31], ribozyme D is targeted for
cleavage of the translational initiation region of the HIV gag gene. This
fragment of pPC-RAG12 was subcloned into pBluescript-KS(+)
[Stratagene, La Jolla, CA] to produce plasmid 132. Plasmid 132 was
then digested with XhoI and EcoRI to yield a fragment containing the
ribozyme D gene flanked by the β-actin promoter at the 5' end and the
SV40 termination and polyadenylation signals at the 3' end of the gene.
This fragment was ligated to the largest fragment generated by digestion
of pHyg [Sugden et al. (1985) Mol. Cell. Biol. 5:410-413] with EcoRI and
SalI to yield pH132. Thus, pH132 is an ~9.3 kb plasmid containing the
following elements: the β-actin promoter linked to an anti-HIV ribozyme
gene followed by the SV40 termination and polyadenylation signals, the
thymidine kinase gene promoter linked to the hygromycin-resistance gene
followed by the thymidine kinase gene polyadenylation signal, and the E.
coli ColE1 origin of replication and the ampicillin-resistance gene.

The plasmid pHyg [see, e.g., U.S. Patent Nos. 4,997,764,
4,686,186 and 5,162,215], which confers resistance to hygromycin B
using transcriptional controls from the HSV-1 tk gene, was originally
constructed from pKan2 [Yates et al. (1984) Proc. Natl. Acad. Sci.
U.S.A. 81:3806-3810] and pLG89 [see, Gritz et al. (1983) Gene
25:179-188]. Briefly, pKan2 was digested with SmaI and BglII to remove
the sequences derived from transposon Tn5. The hygromycin-resistance
hph gene was inserted into the digested pKan2 using blunt-end ligation
at the SmaI site and "sticky-end" ligation [using 1 Weiss unit of T4 DNA
ligase (BRL) in 20 microliter volume] at the BglII site. The SmaI and BglII
sites of pKan2 were lost during ligation.

The resulting plasmid pH132, produced from introduction of the
[...],
includes the anti-HIV ribozyme under control of the β-actin promoter as
well as the hygromycin-resistance gene under control of the TK
promoter.

2. Chromosome banding

Trypsin G-banding of chromosomes was performed as described in
EXAMPLE 1.

3. Cell cultures

TF1004G19 and TF1004G-19C5 mouse cells and the 19C5xHa4
hybrid, described below, and its sublines were cultured in F-12 medium
containing 400 µg/ml Hygromycin B [Calbiochem].
B. Cotransfection of EC3/7C5 to produce TF1004G19

Cotransfection of EC3/7C5 cells with plasmids [pH132, pCH110
available from Pharmacia, see, also Hall et al. (1983) J. Mol. Appl. Gen.
2:101-109] and with λ DNA [λcI 875 Sam 7 (New England Biolabs)] was
conducted using the calcium phosphate DNA precipitation method [see,
e.g., Chen et al. (1987) Mol. Cell. Biol. 7:2745-2752], using 2-5 µg
plasmid DNA and 20 µg λ phage DNA per 5 x 10^6 recipient cells.

C. Cell lines containing the sausage chromosome 

Analysis of one of the transformants, designated TF1004G19,
revealed that it has a high copy number of integrated pH132 and
pCH110 sequences, and a high level of β-galactosidase expression.
G-banding and in situ hybridization with a human probe [CM8; see, e.g.,
U.S. application Serial No. 08/375,271] revealed unexpectedly that
integration had occurred in the formerly dicentric chromosome 7 of the
EC3/7C5 cell line. Furthermore, this chromosome carried a newly formed
heterochromatic chromosome arm. The size of this heterochromatic arm
varied between ~150 and ~800 Mb in individual metaphases.

By single cell cloning from the TF1004G19 cell line, a subclone
TF1004G-19C5 [FIG. 2D], which carries a stable chromosome 7 with a
~100-150 Mb heterochromatic arm [the sausage chromosome] was
obtained. This cell line has been deposited in the ECACC under
Accession No. 96040926. This chromosome arm is composed of four to
five satellite segments rich in satellite DNA, and evenly spaced integrated
heterologous "foreign" DNA sequences. At the end of the compact
heterochromatic arm of the sausage chromosome, a less condensed
euchromatic terminal segment is regularly observed. This subclone was
used for further analyses.

D. Demonstration that the sausage chromosome is derived from the
formerly dicentric chromosome

In situ hybridization with λ phage and pH132 DNA on the
TF1004G-19C5 cell line showed positive hybridization only on the
minichromosome and on the heterochromatic arm of the "sausage"
chromosome [FIG. 2D]. It appears that the "sausage" chromosome
[herein also referred to as the SC] developed from the formerly dicentric
chromosome (FD) of the EC3/7C5 cell line.

To establish this, the integration sites of pCH110 and pH132
plasmids were determined. This was accomplished by in situ
hybridization on these cells with biotin-labeled subfragments of the
hygromycin-resistance gene and the β-galactosidase gene. Both
experiments resulted in narrow hybridizing bands on the heterochromatic
arm of the sausage chromosome. The same hybridization pattern was
detected on the sausage chromosome using a mixture of biotin-labeled λ
probe and pH132 plasmid, proving the cointegration of λ phages, pH132
and pCH110 plasmids.

To examine this further, the cells were cultured in the presence of
the DNA-binding dye Hoechst 33258. Culturing of mouse cells in the
presence of this dye results in under-condensation of the pericentric
heterochromatin [...]
observation of the hybridization pattern. Using this technique, the
heterochromatic arm of the sausage chromosome of TF1004G-19C5 cells
showed regular under-condensation revealing the details of the structure
of the "sausage" chromosome by in situ hybridization. Results of in situ
hybridization on Hoechst-treated TF1004G-19C5 cells with biotin-labeled
subfragments of hygromycin-resistance and β-galactosidase genes show
that these genes are localized only in the heterochromatic arm of the
sausage chromosome. In addition, an equal banding hybridization pattern
was observed. This pattern of repeating units [amplicons] clearly
indicates that the sausage chromosome was formed by an amplification
process and that the λ phage, pH132 and pCH110 plasmid DNA
sequences border the amplicons.

In another series of experiments using fluorescence in situ
hybridization [FISH] carried out with mouse major satellite DNA, the main
component of the mouse pericentric heterochromatin, the results
confirmed that the amplicons of the sausage chromosome are primarily
composed of satellite DNA.

E. The sausage chromosome has one centromere 

To determine whether mouse centromeric sequences had
participated in the amplification process forming the "sausage"
chromosome and whether or not the amplicons carry inactive
centromeres, in situ hybridization was carried out with mouse minor
satellite DNA. Mouse minor satellite DNA is localized specifically near
the centromeres of all mouse chromosomes. Positive hybridization was
detected in all mouse centromeres including the sausage chromosome,
which, however, only showed a positive signal at the beginning of the
heterochromatic arm.

Indirect immunofluorescence with a human anti-centromere
antibody [LU 851] which recognizes only functional centromeres [see,
e.g., Hadlaczky et al. (1989) Chromosoma 97:282-288] proved that the
sausage chromosome has only one active centromere. The centromere
comes from the formerly dicentric part of the chromosome and
co-localizes with the in situ hybridization signal of the mouse minor
satellite DNA probe.

F. The selected and non-selected heterologous DNA in the
heterochromatin of the sausage chromosome is expressed

1. High levels of the heterologous genes are expressed

The TF1004G-19C5 cell line thus carries multiple copies of
hygromycin-resistance and β-galactosidase genes localized only in the
heterochromatic arm of the sausage chromosome. The TF1004G-19C5
cells can grow very well in the presence of 200 µg/ml or even 400 µg/ml
hygromycin B. [The level of expression was determined by Northern
hybridization with a subfragment of the hygromycin-resistance gene and
a single copy gene.]

The expression of the non-selected β-galactosidase gene in the
TF1004G-19C5 transformant was detected with LacZ staining of the
cells. By this method one hundred percent of the cells stained dark blue,
showing that there is a high level of β-galactosidase expression in all of
the TF1004G-19C5 cells.

2. The heterologous genes that are expressed are in the
heterochromatin of the sausage chromosome

To demonstrate that the genes localized in the constitutive
heterochromatin of the sausage chromosome provide the hygromycin
resistance and the LacZ staining capability of TF1004G-19C5
transformants [i.e., β-gal expression], PEG-induced cell fusion between
TF1004G-19C5 mouse cells and Chinese hamster ovary cells was
performed. The hybrids were selected and maintained in HAT medium
containing G418 [400 µg/ml] and hygromycin [200 µg/ml]. Two hybrid
clones designated 19C5xHa3 and 19C5xHa4, which have been
deposited in the ECACC under Accession No. 96040927, were selected.
Both carry the sausage chromosome and the minichromosome.

Twenty-seven single cell derived colonies of the 19C5xHa4 hybrid
were maintained and analyzed as individual subclones. In situ
hybridization with hamster and mouse chromosome painting probes and
hamster chromosome 2-specific probes verified that the 19C5xHa4 clone
contains the complete Chinese hamster genome and a partial mouse
genome. All 19C5xHa4 subclones retained the hamster genome, but
different subclones showed different numbers of mouse chromosomes,
indicating the preferential elimination of mouse chromosomes.

To promote further elimination of mouse chromosomes, hybrid
cells were repeatedly treated with BrdU. The BrdU treatments, which
destabilize the genome, result in significant loss of mouse chromosomes.
The BrdU-treated 19C5xHa4 hybrid cells were divided into three groups.
One group of the hybrid cells (GH) was maintained in the presence of
hygromycin (200 µg/ml) and G418 (400 µg/ml), and the other two
groups of the cells were cultured under G418 (G) or hygromycin (H)
selection conditions to promote the elimination of the sausage
chromosome or minichromosome.

One month later, single cell derived subclones were established
from these three subcultures of the 19C5xHa4 hybrid line. The
subclones were monitored by in situ hybridization with biotin-labeled λ
phage and hamster chromosome painting probes. Four individual clones
[G2B5, G3C5, G4D6, G2B4] selected in the presence of G418 that had
lost the sausage chromosome but retained the minichromosome were
found. Under hygromycin selection only one subclone [...]
minichromosome. In [...] megachromosome [see, Example 5]
was present.

Since hygromycin-resistance and β-galactosidase genes were
thought to be expressed from the sausage chromosome, the expression
of these genes was analyzed in the four subclones that had lost the
sausage chromosome. In the presence of 200 µg/ml hygromycin, one
hundred percent of the cells of four individual subclones died. In order to
detect the β-galactosidase expression, hybrid subclones were analyzed
by LacZ staining. One hundred percent of the cells of the four subclones
that lost the sausage chromosome also lost the LacZ staining capability.
All of the other hybrid subclones that had not lost the sausage
chromosome under the non-selective culture conditions showed positive
LacZ staining.

These findings demonstrate that the expression of
hygromycin-resistance and β-galactosidase genes is linked to the presence of the
sausage chromosome. Results of in situ hybridizations show that the
heterologous DNA is expressed from the constitutive heterochromatin of
the sausage chromosome.

In situ hybridization studies of three other hybrid subclones [G2C6,
G2D1, and G4D5] did not detect the presence of the sausage
chromosome. By the LacZ staining method, some stained cells were
detected in these hybrid lines, and when these subclones were
transferred to hygromycin selection some colonies survived. Cytological
analysis and in situ hybridization of these hygromycin-resistant colonies
revealed the presence of the sausage chromosome, suggesting that only
the cells of G2C6, G2D1 and G4D5 hybrids that had not lost the sausage
chromosome were able to preserve the hygromycin resistance and
β-galactosidase expression. These results confirmed that the expression of
these genes is linked to the presence of the sausage chromosome. The
level of β-galactosidase expression was determined by the immunoblot
technique using a monoclonal antibody.

Hygromycin resistance and β-galactosidase expression of the cells
which contained the sausage chromosome were provided by the genes
localized in the mouse pericentric heterochromatin. This was
demonstrated by performing Southern DNA hybridizations on the hybrid
cells that lack the sausage chromosome using PCR-amplified
subfragments of hygromycin-resistance and β-galactosidase genes as
probes. None of the subclones showed hybridization with these probes;
however, all of the analyzed clones contained the minichromosome.
Other hybrid clones that contain the sausage chromosome showed
intense hybridization with these DNA probes. These results lead to the
conclusion that hygromycin resistance and β-galactosidase expression of
the cells that contain the sausage chromosome were provided by the
genes localized in the mouse pericentric heterochromatin.

EXAMPLE 5 

The gigachromosome 

As described in Example 4, the sausage chromosome was
transferred into Chinese hamster cells by cell fusion. Using Hygromycin
B/HAT and G418 selection, two hybrid clones 19C5xHa3 and 19C5xHa4
were produced that carry the sausage chromosome. In situ hybridization
using hamster and mouse chromosome-painting probes and a hamster
chromosome 2-specific probe verified that clone 19C5xHa4 contains a
complete Chinese hamster genome as well as partial mouse genomes.
Twenty-seven separate colonies of 19C5xHa4 cells were maintained and
analyzed as individual subclones. Twenty-six out of 27 subclones
contained a morphologically unchanged sausage chromosome.

In one subclone of the 19C5xHa3 cell line, 19C5xHa47 [see FIG.
2E], the heterochromatic arm of the sausage chromosome became
unstable and showed continuous intrachromosomal growth. In extreme
cases, the amplified chromosome arm exceeded 1000 Mb in size
(gigachromosome).

EXAMPLE 6 

The stable megachromosome 

A. Generation of cell lines containing the megachromosome 

All 19C5xHa4 subclones retained a complete hamster genome, but
different subclones showed different numbers of mouse chromosomes,
indicating the preferential elimination of mouse chromosomes. As
described in Example 4, to promote further elimination of mouse
chromosomes, hybrid cells were treated with BrdU, cultured under G418
(G) or hygromycin (H) selection conditions followed by repeated
treatment with 10[...] M BrdU for 16 hours and single cell subclones were
established. The BrdU treatments appeared to destabilize the genome,
resulting in a change in the sausage chromosome as well. A gradual
increase in a cell population in which a further amplification had occurred
was observed. In addition to the ~100-150 Mb heterochromatic arm of
the sausage chromosome, an extra centromere and a ~150-250 Mb
heterochromatic chromosome arm were formed, which differed from
those of mouse chromosome 7. By the acquisition of another
euchromatic terminal segment, a new submetacentric chromosome
(megachromosome) was formed. Seventy-nine individual subclones were
established from these BrdU-treated cultures by single-cell cloning: 42
subclones carried the intact megachromosome, 5 subclones carried the
sausage chromosome, and in 32 subclones fragments or translocated
segments of the megachromosome were observed. Twenty-six
subclones that carried the megachromosome were cultured under
non-selective conditions over a two-month period. In 19 of these
subclones, the megachromosome was retained. Those subclones which
lost the megachromosomes all became sensitive to Hygromycin B and
had no β-galactosidase expression, indicating that both markers were
linked to the megachromosome.
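
The subclone counts reported above can be tallied as in the following minimal sketch, which checks that the three categories account for all 79 subclones and expresses each as a share of the total. The dictionary keys are descriptive labels chosen for this illustration only.

```python
# Subclone counts from the BrdU-treated cultures, as stated in the text
subclone_counts = {
    "intact megachromosome": 42,
    "sausage chromosome": 5,
    "fragments or translocated segments": 32,
}

total_subclones = sum(subclone_counts.values())
shares = {label: n / total_subclones for label, n in subclone_counts.items()}

print(f"total subclones: {total_subclones}")
for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```

On these figures, slightly more than half of the subclones carried the intact megachromosome.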

Two sublines (G3D5 and H1D3), which were chosen for further
experiments, showed no changes in the morphology of the
megachromosome during more than 100 generations under selective
conditions. The G3D5 cells had been obtained by growth of 19C5xHa4
cells in G418-containing medium followed by repeated BrdU treatment,
whereas H1D3 cells had been obtained by culturing 19C5xHa4 cells in
hygromycin-containing medium followed by repeated BrdU treatment.
B. Structure of the megachromosome

The following results demonstrate that, apart from the euchromatic
terminal segments and the integrated foreign DNA (and, as in the
exemplified embodiments, rDNA sequence), the whole megachromosome is
constitutive heterochromatin, containing a tandem array of at least 40
[~7.5 Mb] blocks of mouse major satellite DNA [see Figures 2 and 3].
Four satellite DNA blocks are organized into a giant palindrome
[amplicon] carrying integrated exogenous DNA sequences at each end.
The long and short arms of the submetacentric megachromosome
[...] amplicons, respectively. It is of course understood
that the specific organization and size of each component can vary
among species, and also the chromosome in which the amplification
event initiates.

1. The megachromosome is composed primarily of
heterochromatin

Except for the terminal regions and the integrated foreign DNA, the
megachromosome is composed primarily of heterochromatin. This was
demonstrated by C-banding of the megachromosome, which resulted in
positive staining characteristic of constitutive heterochromatin. Apart
from the terminal regions and the integrated foreign DNA, the whole
megachromosome appears to be heterochromatic. Mouse major satellite
DNA is the main component of the pericentric, constitutive
heterochromatin of mouse chromosomes and represents ~10% of the
total DNA [Waring et al. (1966) Science 154:791-794]. Using a mouse
major satellite DNA probe for in situ hybridization, strong hybridization
was observed throughout the megachromosome, except for its terminal
regions. The hybridization showed a segmented pattern: four large
blocks appeared on the short arm and usually 4-7 blocks were seen on
the long arm. By comparing these segments with the pericentric regions
of normal mouse chromosomes that carry ~15 Mb of major satellite
DNA, the size of the blocks of major satellite DNA on the
megachromosome was estimated to be ~30 Mb.

Using a mouse probe specific to euchromatin [pMCPE1.51; a
mouse long interspersed repeated DNA probe], positive hybridization was
detected only on the terminal segments of the megachromosome of the
H1D3 hybrid subline. In the G3D5 hybrids, hybridization with a
hamster-specific probe revealed that several megachromosomes contained
terminal segments of hamster origin on the long arm. This observation
indicated that the acquisition of the terminal segments on these
chromosomes happened in the hybrid cells, and that the long arm of the
megachromosome was the recently formed arm. When a mouse
minor satellite probe was used, specific to the centromeres of mouse
chromosomes [Wong et al. (1988) Nucl. Acids Res. 16:11645-11661], a
strong hybridization signal was detected only at the primary constriction
of the megachromosome, which colocalized with the positive
signal produced with human anti-centromere serum.

[illegible] probes revealed that [illegible]
between the mouse major satellite DNA segments. Each segment of
mouse major satellite DNA was bordered by a narrow band of integrated
heterologous DNA, except at the second segment of the long arm where
a double band of heterologous DNA existed, indicating that the major
satellite segment was missing or considerably reduced in size here.
This chromosome region served as a useful cytological marker in
identifying the long arm of the megachromosome. At a frequency of
10[exponent illegible], "restoration" of these missing satellite DNA
blocks was observed in one chromatid, when the formation of a whole
segment on one chromatid occurred.

After Hoechst 33258 treatment (50 µg/ml for 16 hours), the
megachromosome showed undercondensation throughout its length
except for the terminal segments. This made it possible to study the
architecture of the megachromosome at higher resolution. In situ
hybridization with the mouse major satellite probe on undercondensed
megachromosomes demonstrated that the ~30 Mb major satellite
segments were composed of four blocks of ~7.5 Mb separated from
each other by a narrow band of non-hybridizing sequences [Figure 3].
Similar segmentation can be observed in the large block of pericentric
heterochromatin in metacentric mouse chromosomes from the LMTK- and
A9 cell lines.

2. The megachromosome is composed of segments containing
two tandem ~7.5 Mb blocks followed by two inverted
blocks

Because of the asymmetry in thymidine content between the two
strands of the DNA of the mouse major satellite, when mouse cells are
grown in the presence of BrdU for a single S phase, the constitutive
heterochromatin shows lateral asymmetry after FPG staining. Also, in
the 19C5xHa4 hybrids, the thymidine-kinase [Tk] deficiency of the
mouse fibroblast cells was complemented by the hamster Tk gene,
permitting BrdU incorporation experiments.

A striking structural regularity in the megachromosome was
detected using the FPG technique. In both chromatids, alternating dark
and light staining that produced a checkered appearance of the
megachromosome was observed. A similar picture was obtained by
labelling with fluorescein-conjugated anti-BrdU antibody. Comparing
these pictures to the segmented appearance of the megachromosome
showed that one dark and one light FPG band corresponded to one ~30
Mb segment of the megachromosome. These results suggest that the
two halves of the ~30 Mb segment have an inverted orientation. This
was verified by combining in situ hybridization and immunolabelling of
the incorporated BrdU with fluorescein-conjugated anti-BrdU antibody on
the same chromosome. Since the ~30 Mb segments [or amplicons] of
the megachromosome are composed of four blocks of mouse major
satellite DNA, it can be concluded that two tandem ~7.5 Mb blocks are
followed by two inverted blocks within one segment.
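
The segment organization inferred above can be written out as a small model. The following Python sketch is illustrative only: the class and function names are hypothetical, and the block size (~7.5 Mb) and the number of blocks per segment (four) are simply the estimates given in the text.

```python
from dataclasses import dataclass

BLOCK_MB = 7.5          # approximate size of one major satellite block (from the text)
BLOCKS_PER_SEGMENT = 4  # two tandem blocks followed by two inverted blocks


@dataclass(frozen=True)
class Block:
    size_mb: float
    orientation: str  # "+" for one orientation, "-" for the inverted one


def build_segment() -> list[Block]:
    """Return one segment: two tandem blocks, then two inverted blocks."""
    return [Block(BLOCK_MB, "+"), Block(BLOCK_MB, "+"),
            Block(BLOCK_MB, "-"), Block(BLOCK_MB, "-")]


def segment_size_mb(segment: list[Block]) -> float:
    """Total size of a segment in Mb."""
    return sum(b.size_mb for b in segment)


def orientation_pattern(n_segments: int) -> str:
    """Concatenate block orientations for n_segments consecutive segments."""
    return "".join(b.orientation for _ in range(n_segments) for b in build_segment())


if __name__ == "__main__":
    print(segment_size_mb(build_segment()))  # 30.0
    print(orientation_pattern(2))            # ++--++--
```

Under these assumptions one segment comes to 30 Mb, and each half of a segment (one "+" pair or one "-" pair) would correspond to one dark or one light FPG band.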

Large-scale mapping of megachromosome DNA by pulsed-field
electrophoresis and Southern hybridization with "foreign" DNA probes
revealed a simple pattern of restriction fragments. Using endonucleases
with none, or only a single cleavage site in the integrated foreign DNA
sequences, followed by hybridization with a hyg probe, 1-4 predominant
fragments were detected. Since the megachromosome contains 10-12
amplicons with an estimated 3-8 copies of hyg sequences per amplicon
(30-90 copies per megachromosome), the small number of hybridizing
fragments indicates the homogeneity of DNA in the amplified segments.

3. Scanning electron microscopy of the megachromosome
confirmed the above findings

The homogeneous architecture of the heterochromatic arms of the
megachromosome was confirmed by high resolution scanning electron
microscopy. Extended arms of megachromosomes, and the pericentric
heterochromatic region of mouse chromosomes, treated with Hoechst
33258, showed similar structure. The constitutive heterochromatic
regions appeared more compact than the euchromatic segments. Apart
from the terminal regions, both arms of the megachromosome were
completely extended, and showed faint grooves, which should
correspond to the border of the satellite DNA blocks in the non-amplified
chromosomes and in the megachromosome. Without Hoechst
treatment, the grooves seemed to correspond to the amplicon borders on
the megachromosome arms. In addition, centromeres showed a more
compact, finely fibrous appearance than the surrounding
heterochromatin.

4. The megachromosome of 1B3 cells contains rRNA gene
sequence

The sequence of the megachromosome in the region of the sites of
integration of the heterologous DNA was investigated by isolation of
these regions using cloning methods and sequence analysis of
the resulting clones. The results of this analysis revealed that the
heterologous DNA was located near mouse ribosomal RNA gene (i.e.,
rDNA) sequences contained in the megachromosome.

a. Cloning of regions of the megachromosomes in which
heterologous DNA had integrated

Megachromosomes were isolated from 1B3 cells (which were
generated by repeated BrdU treatment and single cell cloning of
H1xHe41 cells (see Figure 4) and which contain a truncated
megachromosome) using fluorescence-activated cell sorting methods as
described herein (see Example 10). Following separation of the SATACs
(megachromosomes) from the endogenous chromosomes, the isolated
megachromosomes were stored in GH buffer (100 mM glycine,
1% hexylene glycol, pH 8.4-8.6 adjusted with saturated calcium
hydroxide solution; see Example 10) and centrifuged into an agarose bed
in 0.5 M EDTA.

Large-scale mapping of the megachromosome around the area of
the site of integration of the heterologous DNA revealed that it is
enriched in sequence containing rare-cutting enzyme sites, such as the
recognition site for NotI. Additionally, mouse major satellite DNA (which
makes up the majority of the megachromosome) does not contain NotI
recognition sites. Therefore, to facilitate isolation of regions of the
megachromosome associated with the site of integration of the
heterologous DNA, the isolated megachromosomes were cleaved with
NotI, a rare-cutting restriction endonuclease with an 8-bp GC recognition
site. Fragments of the megachromosome were inserted into plasmid
pWE15 (Stratagene, La Jolla, California) as follows. Half of a 100-µl low
melting point agarose block (mega-plug) containing the isolated SATACs
was digested with NotI overnight at 37 °C. Plasmid pWE15 was similarly
digested with NotI overnight. The mega-plug was then melted and mixed
with the digested plasmid, ligation buffer and T4 ligase. Ligation was
conducted at 16 °C overnight. Bacterial DH5α cells were transformed
with the ligation product and transformed cells were plated onto LB/Amp
plates. Fifteen to twenty colonies were grown on each plate for a total
of 189 colonies. Plasmid DNA was isolated from colonies that survived
growth on LB/Amp medium and was analyzed by Southern blot
hybridization for the presence of DNA that hybridized to a pUC19 probe.
This screening methodology assured that all clones, even clones lacking
an insert but yet containing the pWE15 plasmid, would be detected. Any
clones containing insert DNA would be expected to contain
non-satellite, GC-rich megachromosome DNA sequences located at the site of
integration of the heterologous DNA. All colonies were positive for
hybridizing DNA.

The 189 transformants were used to generate
cosmid minipreps for analysis of restriction sites within the insert DNA.
Six of the original 189 cosmid clones contained an insert. These clones
were designated as follows: 28 (~9-kb insert), 30 (~9-kb insert), 60
(~4-kb insert), 113 (~9-kb insert), 157 (~9-kb insert) and 161 (~9-kb
insert). Restriction enzyme analysis indicated that three of the clones
(113, 157 and 161) contained the same insert.

b. In situ hybridization experiments using isolated
segments of the megachromosome as probes

Insert DNA from clones 30, 113, 157 and 161 was purified,
labeled and used as probes in in situ hybridization studies of several cell
lines. Counterstaining of the cells with propidium iodide facilitated
identification of the cytological sites of the hybridization signals. The
locations of the signals detected within the cells are summarized in the
following table:




CELL TYPE                  PROBE        LOCATION OF SIGNAL

[illegible]                No. 161      4-5 pairs of acrocentric chromosomes
                                        at centromeric regions.

[illegible]                No. 161      Acrocentric ends of 4 pairs of
                                        chromosomes.

[illegible]                [illegible]  Minichromosome and the end of the
                                        formerly dicentric chromosome.
                                        Pericentric heterochromatin of one of
                                        the metacentric mouse chromosomes.
                                        Centromeric region of some of the
                                        other mouse chromosomes.

K20 Chinese Hamster        No. 30       Ends of at least 6 pairs of
Cells                                   chromosomes. An interstitial signal
                                        on a short chromosome.

HB31 Cells                 No. 30       Acrocentric ends of at least 12 pairs
(mouse-hamster hybrid                   of chromosomes. Centromeres of
cells derived from H1D3                 certain chromosomes and the
cells by repeated BrdU                  megachromosome. Borders of the
treatment and single                    amplicons of the megachromosome.
cell cloning which
carries the
megachromosome)

Mouse Spleen Cells         No. 30       Similar to signal observed for probe
                                        no. 161. Centromeres of 5 pairs of
                                        chromosomes. Weak cross-hybridization
                                        to pericentric heterochromatin.

HB31 Cells                 No. 113      Similar to signal observed for probe
                                        no. 30.

Mouse Spleen Cells         No. 113      Centromeric region of 5 pairs of
                                        chromosomes.

K20 Cells                  No. 113      At least 6 pairs of chromosomes.
                                        Weak signal at some telomeres and
                                        several interspersed signals.

Human Lymphocyte           No. 157      Similar to signal observed for probe
Cells (male)                            no. 161.



c. Southern blot hybridization using isolated segments of
the megachromosome as probes

DNA was isolated from mouse spleen tissue, mouse LMTK- cells,
K20 Chinese hamster ovary cells, EJ30 human fibroblast cells and H1D3
cells. The isolated DNA, and lambda phage DNA, was subjected to
Southern blot hybridization using inserts isolated from megachromosome
clone nos. 30, 113, 157 and 161 as probes. Plasmid pWE15 was used
as a negative control probe. Each of the four megachromosome clone
inserts hybridized in a multi-copy manner (as demonstrated by the
intensity of hybridization and the number of hybridizing bands) to all of
the DNA samples, except the lambda phage DNA. Plasmid pWE15
hybridized to lambda DNA only.



WO 97/40183

[illegible] obtain five subclones as follows. [illegible] of clone
no. 161, the clone was digested with NotI and BamHI and ligated with
NotI/BamHI-digested pBluescript [illegible]. Three fragments of the
insert of clone [illegible] sequenced manually. However, due to their
extremely high GC content, autoradiographs were difficult to interpret
and sequencing was repeated using an ABI sequencer and the
dye-terminator cycle protocol. A comparison of the sequence data to
sequences in [illegible] corresponds to an [illegible] X82564, which is
provided as SEQ ID NO. [illegible]. The sequence data obtained for the
insert of clone no. 161 is set forth in SEQ ID NOS. 18-24.

[illegible] the following positions in GENBANK accession no. X82564
(i.e., SEQ ID [illegible]):

Subclone   Start in   End in    Site          SEQ ID No.
           X82564     X82564

161k1      7579       7755      NotI, BamHI   18

161m5      7756       8494      BamHI         19

161m7      8495       10231     BamHI         20 (shows only sequence
                                              corresponding to nt. 8495-8950),
                                              21 (shows only sequence
                                              corresponding to nt. 9851-10231)

161m12     10232      15000     BamHI         22 (shows only sequence
                                              corresponding to nt. 10232-10600),
                                              23 (shows only sequence
                                              corresponding to nt. 14267-15000)

161k2      15001      15676     NotI, BamHI   24
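
The coordinates in the table can be checked for internal consistency with a few lines of Python. This is only a sketch: the helper names are hypothetical, the first subclone name is read here as 161k1, and the positions are copied from the table as printed.

```python
# Start/end positions in GENBANK accession no. X82564, copied from the table.
SUBCLONES = [
    ("161k1", 7579, 7755),
    ("161m5", 7756, 8494),
    ("161m7", 8495, 10231),
    ("161m12", 10232, 15000),
    ("161k2", 15001, 15676),
]


def lengths(subclones):
    """Length in bp of each subclone, counting both end positions."""
    return {name: end - start + 1 for name, start, end in subclones}


def is_contiguous(subclones):
    """True if each subclone starts one position after the previous one ends."""
    return all(nxt[1] == prev[2] + 1 for prev, nxt in zip(subclones, subclones[1:]))


def total_span(subclones):
    """Number of positions from the first start to the last end, inclusive."""
    return subclones[-1][2] - subclones[0][1] + 1


if __name__ == "__main__":
    print(lengths(SUBCLONES))
    print(is_contiguous(SUBCLONES))  # True
    print(total_span(SUBCLONES))     # 8098
```

On these figures the five subclones tile the region without gaps or overlaps and together span about 8.1 kb of the X82564 sequence.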



The sequence set forth in SEQ ID NOs. 18-24 diverges in some
positions from the sequence presented in positions 7551-15670 of
GENBANK accession no. X82564. Such divergence may be attributable
to random mutations between repeat units of rDNA. The results of the
sequence analysis of clone no. 161, which reveal that it corresponds to
rDNA, correlate with the appearance of the in situ hybridization signal it
generated in human lymphocytes and mouse spleen cells. The
hybridization signal was clearly observed on acrocentric chromosomes in
these cells, and such types of chromosomes are known to include rDNA
adjacent to the pericentric satellite DNA on the short arm of the
chromosome. Furthermore, rRNA genes are highly conserved in
mammals as supported by the cross-species hybridization of clone no.
161 to human chromosomal DNA.

To isolate amplification-replication control regions such as those
found in rDNA, it may be possible to subject DNA isolated from
megachromosome-containing cells, such as H1D3 cells, to nucleic acid
amplification using, e.g., the polymerase chain reaction (PCR) with the
following primers:

amplification control element forward primer [positions illegible]
5'-GAGGAATTCCGCATCCCT[illegible]ATCCAGATTGGTG-3' (SEQ ID NO. 25)

amplification control element reverse primer (2142-2111)
5'-AAACTGCAGGCCGAGGCAGCT[illegible]

origin of replication region forward primer (2116-2141)
5'-AGGAATTCACAGAAGAGAGGTGGCTCGGCCTGC-3' (SEQ ID NO. 27)

origin of replication region reverse primer (5546-5521)
5'-AGCCTGCAGGAAGTCATACCTGGGGAGGTGGCCC-3' (SEQ ID NO. 28)
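
As a quick check on the two legible origin-of-replication primers, the Python sketch below looks for common restriction enzyme recognition sequences near their 5' ends and computes the length of the template region they bound. The function names are hypothetical, and reading the GAATTC and CTGCAG motifs as deliberately added cloning sites is an interpretation, not something the text states.

```python
# Primer sequences copied from SEQ ID NO. 27 and SEQ ID NO. 28 as printed.
ORI_FORWARD = "AGGAATTCACAGAAGAGAGGTGGCTCGGCCTGC"
ORI_REVERSE = "AGCCTGCAGGAAGTCATACCTGGGGAGGTGGCCC"

# Recognition sequences of two widely used restriction enzymes.
SITES = {"EcoRI": "GAATTC", "PstI": "CTGCAG"}


def find_sites(primer):
    """Map enzyme name -> 0-based index of its site in the primer, if present."""
    return {name: primer.find(seq) for name, seq in SITES.items() if seq in primer}


def template_span(first, last):
    """Inclusive length of a template region given its two end coordinates."""
    return abs(last - first) + 1


if __name__ == "__main__":
    print(find_sites(ORI_FORWARD))    # {'EcoRI': 2}
    print(find_sites(ORI_REVERSE))    # {'PstI': 3}
    # Region bounded by the forward (2116-2141) and reverse (5546-5521) primers.
    print(template_span(2116, 5546))  # 3431
```

The forward primer carries an EcoRI motif and the reverse primer a PstI motif close to the 5' end, and the primer coordinates bracket a template region of 3431 positions (the 5' tails would add a few bases to any amplified product).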
C. Summary of the formation of the megachromosome

Figure 2 schematically sets forth events leading to the formation of
a stable megachromosome beginning with the generation of a dicentric
chromosome in a mouse LMTK- cell line: (A) A single E-type amplification
in the centromeric region of the mouse chromosome 7 following
transfection of LMTK- cells with λCM8 and λgtWESneo generates the
neo-centromere linked to the integrated foreign DNA, and forms a
dicentric chromosome. Multiple E-type amplification forms the
λneo-chromosome, which was derived from chromosome 7 and stabilized in a
mouse-hamster hybrid cell line; (B) Specific breakage between the
centromeres of a dicentric chromosome 7 generates a chromosome
fragment with the neo-centromere, and a chromosome 7 with traces of
foreign DNA at the end; (C) Inverted duplication of the fragment bearing
the neo-centromere results in the formation of a stable
neo-minichromosome; (D) Integration of exogenous DNA into the foreign DNA
region of the formerly dicentric chromosome 7 initiates H-type
amplification, and the formation of a heterochromatic arm. By capturing
a euchromatic terminal segment, this new chromosome arm is stabilized
in the form of the "sausage" chromosome; (E) BrdU treatment and/or
drug selection appears to induce further H-type amplification, which
results in the formation of an unstable gigachromosome; (F) Repeated
BrdU treatments and/or drug selection induce further H-type amplification
including a centromere duplication, which leads to the formation of
another heterochromatic chromosome arm. It is split off from the
chromosome 7 by chromosome breakage and acquires a terminal
segment to form the stable megachromosome.

D. Expression of β-galactosidase and hygromycin transferase genes in
cell lines carrying the megachromosome or derivatives thereof

The level of heterologous gene (i.e., β-galactosidase and
hygromycin transferase genes) expression in cell lines containing the
megachromosome or a derivative thereof was quantitatively measured.
The relationship between the copy-number of the heterologous genes
and the level of protein expressed therefrom was also determined.

1. Materials and methods

a. Cell lines

Heterologous gene expression levels of H1D3 cells, carrying a
250-400 Mb megachromosome as described above, and mM2C1 cells,
carrying a 50-60 Mb micro-megachromosome, were quantitatively
evaluated. mM2C1 cells were generated by repeated BrdU treatment and
single cell cloning of the H1xHe41 cell line (mouse-hamster-human hybrid
cell line carrying the megachromosome and a single human chromosome
with CD4 and neo^r genes; see Figure 4). The cell lines were grown under
standard conditions in F12 medium under selective (120 µg/ml
hygromycin) or non-selective conditions.

b. Preparation of cell extract for β-galactosidase assays

Monolayers of mM2C1 or H1D3 cell cultures were washed three
times with phosphate-buffered saline (PBS). Cells were scraped with a
rubber policeman and suspended and washed again in PBS. Washed
cells were resuspended into 0.25 M Tris-HCl, pH 7.8, and disrupted by
three cycles of freezing in liquid nitrogen and thawing. The
extract was clarified by centrifugation at 12,000 rpm for 5 min at 4 °C.

c. β-galactosidase assay

The β-galactosidase assay mixture contained 1 mM MgCl2,
45 mM β-mercaptoethanol, 0.8 mg/ml o-nitrophenyl-β-D-galactopyranoside
and 66 mM sodium phosphate, pH 7.5. After incubating the reaction
mixture with the cell extract at 37 °C for increasing time, the reaction
was terminated by the addition of three volumes of 1 M Na2CO3, and
the optical density was measured at 420 nm. Assay mixture incubated
without cell extract was used as a control. The linear range of the
reaction was determined to be between 0.1-0.8 OD420. One unit of
β-galactosidase activity is defined as the amount of enzyme that will
hydrolyse 3 nmoles of o-nitrophenyl-β-D-galactopyranoside in 1 minute
at 37 °C.
d. Preparation of cell extract for hygromycin
phosphotransferase assay

Cells were washed as described above and resuspended into 20
mM Hepes buffer, pH 7.3, 100 mM potassium acetate, 5 mM Mg acetate
and 2 mM dithiothreitol. Cells were disrupted at 0 °C by six 10 sec
bursts in an MSE ultrasonic disintegrator using a microtip probe. Cells
were allowed to cool for 1 min after each ultrasonic burst. The extracts
were clarified by centrifuging for 1 min at 2000 rpm in a microcentrifuge.

e. Hygromycin phosphotransferase assay

Enzyme activity was measured by means of the phosphocellulose
paper binding assay as described by Haas and Dowding [(1975) Meth.
Enzymol. 43:611-628]. The cell extract was supplemented with 0.1 M
ammonium chloride and 1 mM adenosine-γ-32P-triphosphate (specific
activity: 300 Ci/mmol). The reaction was initiated by the addition of 0.1
mg/ml hygromycin and incubated for increasing time at 37 °C. The
reaction was terminated by heating the samples for 5 min at 75 °C in a
water bath, and after removing the precipitated proteins by centrifugation
for 5 min in a microcentrifuge, an aliquot of the supernatant was spotted
on a piece of Whatman P-81 phosphocellulose paper (2 cm²). After 30
sec at room temperature the papers were placed into 500 ml of hot (75 °C)
distilled water for 3 min. While the radioactive ATP remains in solution
under these conditions, hygromycin phosphate binds strongly and
quantitatively to phosphocellulose. The papers were rinsed 3 times in 500
ml of distilled water and the bound radioactivity was measured in toluene
scintillation cocktail in a Beckman liquid scintillation counter. Reaction
mixture incubated without added hygromycin served as a control.

f. Determination of the copy-number of the heterologous
genes

DNA was prepared from the H1D3 and mM2C1 cells using
standard purification protocols involving SDS lysis of the cells followed
by Proteinase K treatment and phenol/chloroform extractions. The
isolated DNA was digested with an appropriate restriction endonuclease,
fractionated on agarose gels, blotted to nylon filters and hybridized with
a radioactive probe derived either from the β-galactosidase or the
hygromycin phosphotransferase genes. The level of hybridization was
quantified in a Molecular Dynamics PhosphorImage Analyzer. To control
the total amount of DNA loaded from the different cell lines, the filters
were reprobed with a single copy gene, and the hybridization of
β-galactosidase and hygromycin phosphotransferase genes was normalized
to the single copy gene hybridization.
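
The normalization step described above amounts to dividing each gene-specific signal by the single copy gene signal from the same lane before comparing cell lines. A minimal Python sketch follows; the function names are hypothetical and the numbers in the usage example are invented purely to illustrate the arithmetic, not taken from the experiments.

```python
def normalized_signal(gene_signal, single_copy_signal):
    """Hybridization signal of the gene of interest per unit of single-copy signal."""
    if single_copy_signal <= 0:
        raise ValueError("single-copy reference signal must be positive")
    return gene_signal / single_copy_signal


def relative_copy_number(line_a, line_b):
    """Ratio of normalized signals, line_a over line_b.

    Each argument is a (gene_signal, single_copy_signal) pair from one lane.
    """
    return normalized_signal(*line_a) / normalized_signal(*line_b)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calculation.
    lane_a = (50000.0, 1000.0)
    lane_b = (4000.0, 800.0)
    print(relative_copy_number(lane_a, lane_b))  # 10.0
```

Dividing by the single copy signal first means that unequal DNA loading between lanes cancels out of the final ratio.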

g. Determination of protein concentration

The total protein content of the cell extracts was measured by the
Bradford colorimetric assay using bovine serum albumin as standard.

2. [illegible] of the β-galactosidase and hygromycin
[illegible]

In order to establish quantitative conditions, the most important
kinetic parameters of β-galactosidase and hygromycin
phosphotransferase activity have been studied. The β-galactosidase
activity measured with this colorimetric assay was linear between the
0.1-0.8 OD420 range both for the mM2C1 and H1D3 cell lines. The
β-galactosidase activity was also proportional in both cell lines with the
amount of protein added to the reaction mixture within 5-100 µg total
protein concentration range. The hygromycin phosphotransferase
activity of mM2C1 and H1D3 cell lines was also proportional with the
reaction time or the total amount of added cell extract under the
conditions described for the β-galactosidase.

a. Comparison of β-galactosidase activity of mM2C1 and
H1D3 cell lines

Cell extracts prepared from logarithmically growing mM2C1 and
H1D3 cell lines were tested for β-galactosidase activity, and the specific
activities were compared in 10 independent experiments. The
β-galactosidase activity of H1D3 cell extracts was 440 ± 25 U/mg total
protein. Under identical conditions the β-galactosidase activity of the
mM2C1 cell extracts was 4.8 times lower: 92 ± 13 U/mg total protein.
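
The 4.8-fold figure follows directly from the two specific activities. The Python sketch below reproduces it and attaches an approximate uncertainty. It assumes, as the text does not say, that the ± values are independent standard deviations, and it uses the usual first-order rule of adding relative errors in quadrature; the function name is hypothetical.

```python
import math


def ratio_with_error(a, sd_a, b, sd_b):
    """Return a/b and its propagated uncertainty.

    Assumes the two uncertainties are independent standard deviations,
    so relative errors add in quadrature (first-order approximation).
    """
    ratio = a / b
    rel = math.sqrt((sd_a / a) ** 2 + (sd_b / b) ** 2)
    return ratio, ratio * rel


if __name__ == "__main__":
    ratio, err = ratio_with_error(440.0, 25.0, 92.0, 13.0)
    print(f"{ratio:.1f} +/- {err:.1f}")  # 4.8 +/- 0.7
```

Under those assumptions the H1D3/mM2C1 ratio is about 4.8 with an uncertainty of roughly 0.7.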

β-galactosidase activities of highly subconfluent, subconfluent and
nearly confluent cultures of H1D3 and mM2C1 cell lines were also
compared. In these experiments different numbers of logarithmic H1D3
and mM2C1 cells were seeded in constant volume of culture medium and
grown for 3 days under standard conditions. No significant difference
was found in the β-galactosidase specific activities of cell cultures grown
at different cell densities, and the ratio of H1D3/mM2C1 β-galactosidase
specific activities was also similar for all three cell densities. In
confluent, stationary cell cultures of H1D3 or mM2C1 cells, however, the
expression of β-galactosidase significantly decreased, likely due to
cessation of cell division as a result of contact inhibition.

b. Comparison of hygromycin phosphotransferase
activity of H1D3 and mM2C1 cell lines

The bacterial hygromycin phosphotransferase is present in a
membrane-bound form in H1D3 or mM2C1 cell lines. This follows from
the observation that the hygromycin phosphotransferase activity can be
completely removed by high speed centrifugation of these cell extracts,
and the enzyme activity can be recovered by resuspending the high
speed pellet.

The ratio of the enzyme's specific activity in H1D3 and mM2C1
cell lines was similar to that of β-galactosidase activity, i.e., H1D3 cells
have 4.1 times higher specific activity compared with mM2C1 cells.

c. Hygromycin phosphotransferase activity in H1D3 and
mM2C1 cells grown under non-selective conditions

The level of expression of the hygromycin phosphotransferase
gene was measured on the basis of quantitation of the specific enzyme
activities in H1D3 and mM2C1 cell lines grown under non-selective
conditions for 30 generations. The absence of hygromycin in the
medium did not influence the expression of the hygromycin
phosphotransferase gene.

3. Quantitation of the number of β-galactosidase and
hygromycin phosphotransferase gene copies in H1D3 and
mM2C1 cell lines

As described above, the β-galactosidase and hygromycin
phosphotransferase genes are located only within the megachromosome,
or micro-megachromosome in H1D3 and mM2C1 cells. Quantitative
analysis of genomic Southern blots of DNA isolated from H1D3 and
mM2C1 cell lines with the PhosphorImage Analyzer revealed that the
copy number of β-galactosidase genes integrated into the
megachromosome is approximately 10 times higher in H1D3 cells than in
mM2C1 cells. The copy-number of hygromycin phosphotransferase genes is
approximately 7 times higher in H1D3 cells than in mM2C1 cells.
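
Combining these copy-number ratios with the specific-activity ratios reported earlier (4.8-fold for β-galactosidase, 4.1-fold for hygromycin phosphotransferase) gives a rough figure for enzyme activity per gene copy. The Python sketch below only restates that arithmetic: the dictionary and function names are hypothetical, and all four input ratios are the approximate values quoted in the text.

```python
# Ratios of H1D3 to mM2C1 values quoted in the text (approximate).
ACTIVITY_RATIO = {"beta-galactosidase": 4.8, "hygromycin phosphotransferase": 4.1}
COPY_RATIO = {"beta-galactosidase": 10.0, "hygromycin phosphotransferase": 7.0}


def activity_per_copy_ratio(gene):
    """Relative activity per gene copy in H1D3 compared with mM2C1."""
    return ACTIVITY_RATIO[gene] / COPY_RATIO[gene]


if __name__ == "__main__":
    for gene in ACTIVITY_RATIO:
        print(gene, round(activity_per_copy_ratio(gene), 2))
```

On these approximate figures, enzyme activity increases with copy number in both cases, though by a somewhat smaller factor than the copy number itself.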

4. Summary and conclusions of results of quantitation of
heterologous gene expression in cells containing
megachromosomes or derivatives thereof

Quantitative determination of β-galactosidase activity of higher
eukaryotic cells (e.g., H1D3 cells) carrying the bacterial β-galactosidase
gene in heterochromatic megachromosomes confirmed the observed
high-level expression of the integrated bacterial gene detected by
cytological staining methods. It has generally been established in reports
of studies of the expression of foreign genes in transgenic animals that,
although transgene expression shows correct tissue and developmental
specificity, the level of expression is typically low and shows extensive
position-dependent variability (i.e., the level of transgene expression
depends on the site of chromosomal integration). It is generally assumed
that the low-level transgene expression may be due to the absence of
special DNA sequences which can insulate the transgene from the
inhibitory effect of the surrounding chromatin and promote the formation
of active chromatin structure required for efficient gene expression.
Several cis-acting DNA sequence elements have been identified which
can abolish this position-dependent variability, and can ensure high-level
expression of the transgene: locus activating region (LAR) sequences in
higher eukaryotes and specific chromatin structure (scs) elements in
lower eukaryotes (see, e.g., Eissenberg and Elgin (1991) Trends in
Genet. 7:335-340). If these cis-acting DNA sequences are absent, the
level of transgene expression is low and copy-number independent.

Although the bacterial β-galactosidase reporter gene contained in
the megachromosome is
driven by a potent eukaryotic promoter-enhancer element, no specific
cis-acting DNA sequence element was designed and incorporated into the
bacterial DNA construct which could function as a boundary element.
Thus, the high-level β-galactosidase expression measured in these cells is
of significance, particularly because the β-galactosidase gene in the
megachromosome is located in a long, compact heterochromatic
environment, which is known to be able to block gene expression. The
megachromosome appears to contain DNA sequence element(s) in
association with the bacterial DNA sequences that function to override
the inhibitory effect of heterochromatin on gene expression.

The specificity of the heterologous gene expression in the 
megachromosome is further supported by the observation that the level 
of )ff-galactosidase expression is copy-number dependent. In the H1D3 ' 
cell line, which carries a full-size megachromosome, the specific activity 

15 of ;ff-galactosidase is about 5-fold higher than in mM2C1 cells, which 

carry only a smaller, truncated version of the megachromosome. A J\ 
comparison of the number of >ff-galactosidase gene copies in HI D3 and 
mM2C1 cell lines by quantitative hybridization techniques confirmed that 
the expression of >9-galactosidase is copy-number dependent. The 

20 number of integrated )?-galactosidase gene copies is approximately 10- 
fold higher in the H1D3 cells than in mM2C1 cells. Thus, the cell line 
containing the greater number of copies of the )ff-galactosidase gene also 
yields higher levels of )ff-galactosidase activity, which supports the copy- 
number dependency of expression. The copy number dependency of the 

25 iS-galactosidase and hygromycin phosphotransferase enzyme levels in cell 
lines carrying different derivatives of the megachromosome indicates that 
neither the chromatin organization surrounding the site of integration of 
the bacterial genes, nor the heterochromatic environment of the
megachromosome suppresses the expression of the genes.

The relative amount of β-galactosidase protein expressed in H1D3
cells can be estimated based on the Vmax of this enzyme [500 for
homogeneous, crystallized bacterial β-galactosidase (Naider et al. (1972)
Biochemistry 11:3202-3210)] and the specific activity of H1D3 cell
protein. A Vmax of 500 means that the homogeneous β-galactosidase
protein hydrolyzes 500 µmoles of substrate per minute per mg of enzyme
protein at 37°C. One mg of total H1D3 cell protein extract can
hydrolyze 1.4 µmoles of substrate per minute at 37°C, which means that
0.28% of the protein present in the H1D3 cell extract is β-galactosidase.
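
The estimate above is a simple ratio of specific activities. The following
is a minimal sketch of that arithmetic, using only the values quoted in
the text (a Vmax of 500 for the homogeneous enzyme and 1.4 µmoles per
minute per mg for the H1D3 extract). The function and variable names are
illustrative assumptions and do not come from the original disclosure.

```python
# Sketch: share of total extract protein accounted for by an enzyme,
# estimated as (specific activity of extract) / (Vmax of pure enzyme).
# Names are hypothetical; the numbers are those quoted in the text.

def enzyme_fraction(extract_activity, pure_enzyme_vmax):
    """Return the enzyme's mass fraction of total extract protein.

    extract_activity: umol substrate/min per mg of total extract protein
    pure_enzyme_vmax: umol substrate/min per mg of homogeneous enzyme
    """
    if pure_enzyme_vmax <= 0:
        raise ValueError("pure_enzyme_vmax must be positive")
    return extract_activity / pure_enzyme_vmax


# H1D3 extract: 1.4 umol/min/mg; homogeneous beta-galactosidase: 500.
h1d3_fraction = enzyme_fraction(extract_activity=1.4, pure_enzyme_vmax=500.0)
h1d3_percent = 100.0 * h1d3_fraction

print(f"{h1d3_percent:.2f}% of H1D3 extract protein is beta-galactosidase")
```

The same ratio cannot be formed for an enzyme whose Vmax is unknown,
because the denominator is missing.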

Hygromycin phosphotransferase is present in a membrane-bound form
in H1D3 and mM2C1 cells. The tendency of the enzyme to
integrate into membranes in higher eukaryotic cells may be related to its
periplasmic localization in prokaryotic cells. The bacterial hygromycin
phosphotransferase has not been purified to homogeneity; thus, its Vmax
has not been determined. Therefore, no estimate can be made of the
total amount of hygromycin phosphotransferase protein expressed in
these cell lines. The 4-fold higher specific activity of hygromycin
phosphotransferase in H1D3 cells as compared to mM2C1 cells,
however, indicates that its expression is also copy-number dependent.

The constant and high-level expression of the β-galactosidase gene
in H1D3 and mM2C1 cells, particularly in the absence of any selective
pressure for the expression of this gene, clearly indicates the stability of
the expression of genes carried in the heterochromatic megachromosomes.
This conclusion is further supported by the observation that the
level of hygromycin phosphotransferase expression did not change when
H1D3 and mM2C1 cells were grown under non-selective conditions. The
consistent high-level, stable, and copy-number dependent expression of
bacterial marker genes clearly indicates that the megachromosome is an
ideal vector system for expression of foreign genes.

EXAMPLE 7

Summary of some of the cell lines with SATACS and minichromosomes 
that have been constructed 

1. EC3/7-Derived cell lines

The LMTK⁻-derived cell line, which is a mouse fibroblast cell line,
was transfected with λCM8 and λgtWESneo DNA [see, EXAMPLE 2] to
produce transformed cell lines. Among these cell lines was EC3/7,
deposited at the European Collection of Animal Cell Culture (ECACC)
under Accession No. 90051001 [see, U.S. Patent No. 5,288,625; see,
also Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110
and U.S. application Serial No. 08/375,271]. This cell line contains the
dicentric chromosome with the neo-centromere. Recloning and selection
produced cell lines such as EC3/7C5, which are cell lines with the stable
neo-minichromosome and the formerly dicentric chromosome [see, Fig.
2C].

2. KE1-2/4 Cells

Fusion of EC3/7 with CHO-K20 cells and selection with G418/HAT
produced hybrid cell lines. Among these was KE1-2/4, which has been
deposited with the ECACC under Accession No. 96040924. KE1-2/4 is
a stable cell line that contains the λneo-chromosome [see, Fig. 2D; see,
also U.S. Patent No. 5,288,625], produced by E-type amplifications.
KE1-2/4 has been transfected with vectors containing λ DNA, selectable
markers, such as the puromycin-resistance gene, and genes of interest,
such as p53 and the anti-HIV ribozyme gene. These vectors target the
gene of interest into the λneo-chromosome by virtue of homologous
recombination with the heterologous DNA in the chromosome.

3. C5pMCT53 Cells

The EC3/7C5 cell line has been co-transfected with pH132,
pCH110 and λ DNA [see, EXAMPLE 2] as well as other constructs.
Various clones and subclones have been selected. For example,
transformation with a construct that includes p53-encoding DNA
produced cells designated C5pMCT53.

4. TF1004G24 Cells

As discussed above, cotransfection of EC3/7C5 cells with
plasmids [pH132, pCH110 available from Pharmacia, see, also Hall et al.
(1983) J. Mol. Appl. Gen. 2:101-109] and with λ DNA [λcl 875 Sam 7
(New England Biolabs)] produced transformed cells. Among these is
TF1004G24, which contains the DNA encoding the anti-HIV ribozyme in
the neo-minichromosome. Recloning of TF1004G24 produced numerous
cell lines. Among these is the NHHL24 cell line. This cell line also has
the anti-HIV ribozyme in the neo-minichromosome and expresses high
levels of β-gal. It has been fused with CHO-K20 cells to produce various
hybrids.

5. TF1004G19-Derived cells

Recloning and selection of the TF1004G transformants produced
the cell line TF1004G19, discussed above in EXAMPLE 4, which
contains the unstable sausage chromosome and the neo-minichromosome.
Single cell cloning produced the TF1004G-19C5 cell line (see ...), which
has a stable sausage chromosome and the neo-minichromosome.
TF1004G-19C5 has been fused with CHO cells and
the hybrids grown under selective conditions to produce the 19C5xHa4
and 19C5xHa3 cell lines [see, EXAMPLE 4] and others. Recloning of the
19C5xHa3 cell line yielded a cell line containing a gigachromosome, i.e.,
cell line 19C5xHa47, see Figure 2E. BrdU treatment of 19C5xHa4 cells
and growth under selective conditions [neomycin (G) and/or hygromycin
(H)] has produced hybrid cell lines such as the G3D5 and G4D6 cell lines
and others. G3D5 has the neo-minichromosome and the
megachromosome. G4D6 has only the neo-minichromosome.

Recloning of 19C5xHa4 cells in H medium produced numerous
clones. Among these is H1D3 [see Figure 4], which has the stable
megachromosome. Repeated BrdU treatment and recloning of H1D3
cells has produced the HB31 cell line, which has been used for
transformations with the pTEMPUD, pTEMPU, pTEMPU3, and
pCEPUR-132 vectors [see, Examples 12 and 14, below].

H1D3 has been fused with a CD4⁺ HeLa cell line that carries DNA
encoding CD4 and neomycin resistance on a plasmid [see, e.g., U.S.
Patent Nos. 5,413,914, 5,409,810, 5,266,600, 5,223,263, 5,215,914
and 5,144,019, which describe these HeLa cells]. Selection with GH has
produced hybrids, including H1xHE41 [see Figure 4], which carries the
megachromosome and also a single human chromosome that includes
the CD4neo construct. Repeated BrdU treatment and single cell cloning
has produced cell lines with the megachromosome [cell line 1B3, see
Figure 4]. About 25% of the 1B3 cells have a truncated
megachromosome [~90-120 Mb]. Another of these subclones,
designated 2C5, was cultured on hygromycin-containing medium and
megachromosome-free cell lines were obtained and grown in
G418-containing medium. Recloning of these cells yielded cell lines such as
1B4 and others that have a dwarf megachromosome [~150-200 Mb],
and cell lines, such as I1C3 and mM2C1, which have a
micro-megachromosome [~50-90 Mb]. The micro-megachromosome of cell
line mM2C1 has no telomeres; however, if desired, synthetic telomeres,
such as those described and generated herein, may be added to the
mM2C1 cell micro-megachromosomes. Cell lines containing
truncated megachromosomes, such as the mM2C1 cell line containing
the micro-megachromosome, can be used to generate even smaller
megachromosomes, e.g., ~10-30 Mb in size. This may be
accomplished, for example, by breakage and fragmentation of the
micro-megachromosome in these cells through exposing the cells to X-ray
irradiation, BrdU or telomere-directed in vivo chromosome fragmentation.

EXAMPLE 8
Replication of the megachromosome 

The homogeneous architecture of the megachromosomes provides a
unique opportunity to perform a detailed analysis of the replication of the
constitutive heterochromatin.
A. Materials and methods 

1. Culture of cell lines 

H1D3 mouse-hamster hybrid cells carrying the megachromosome
[see, EXAMPLE 4] were cultured in F-12 medium containing 10% fetal
calf serum [FCS] and 400 µg/ml Hygromycin B [Calbiochem]. G3D5
hybrid cells [see, Example 4] were maintained in F-12 medium containing
10% FCS, 400 µg/ml Hygromycin B [Calbiochem], and 400 µg/ml G418
[SIGMA]. Mouse A9 fibroblast cells were cultured in F-12 medium
supplemented with 10% FCS.
2. BrdU labelling 

In typical experiments, 20-24 parallel semi-confluent cell cultures
were set up in 10 cm Petri dishes. Bromodeoxyuridine (BrdU) (Fluka)
was dissolved in distilled water alkalized with a drop of NaOH, to make a
10⁻² M stock solution. Aliquots of 10-50 µl of this BrdU stock solution
were added to each 10 ml culture, to give a final BrdU concentration of
10-50 µM. The cells were cultured in the presence of BrdU for 30 min,
and then washed with warm complete medium, and incubated without
BrdU until required. At this point, 5 µg/ml colchicine was added to a
sample culture every 1 or 2 h. After 1-2 h colchicine treatment, mitotic
cells were collected by "shake-off" and regular chromosome preparations
were made for immunolabelling.
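
The dilution described above can be checked with a short calculation.
The sketch below assumes a 10⁻² M stock (the value consistent with
10-50 µl aliquots in 10 ml giving 10-50 µM) and ignores the small
increase in culture volume caused by the aliquot itself. The function
name and its parameters are illustrative assumptions only.

```python
# Sketch: final BrdU concentration after adding a stock aliquot to a culture.
# Assumption: the aliquot volume is negligible relative to the culture volume.

def final_concentration_uM(stock_molar, aliquot_ul, culture_ml):
    """Return the final concentration in micromolar.

    stock_molar: stock concentration in mol/l
    aliquot_ul:  volume of stock added, in microlitres
    culture_ml:  culture volume, in millilitres
    """
    moles_added = stock_molar * aliquot_ul * 1e-6   # mol contained in aliquot
    culture_litres = culture_ml * 1e-3
    return moles_added / culture_litres * 1e6       # mol/l converted to uM


low_uM = final_concentration_uM(stock_molar=1e-2, aliquot_ul=10, culture_ml=10)
high_uM = final_concentration_uM(stock_molar=1e-2, aliquot_ul=50, culture_ml=10)

print(f"final BrdU concentration: {low_uM:.0f}-{high_uM:.0f} uM")
```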

3. Immunolabelling of chromosomes and in situ hybridization

Immunolabelling with fluorescein-conjugated anti-BrdU monoclonal
antibody (Boehringer) was done according to the manufacturer's
recommendations, except that for mouse A9 chromosomes,
2 M hydrochloric acid was used at 37°C for 25 min, while for
chromosomes of hybrid cells, 1 M hydrochloric acid was used at 37°C
for 30 min. In situ hybridization with biotin-labelled probes, and indirect
immunofluorescence and in situ hybridization on the same preparation,
were performed as described previously [Hadlaczky et al. (1991) Proc.
Natl. Acad. Sci. U.S.A. 88:8106-8110; see, also U.S. Patent No.
5,288,625].

4. Microscopy 

All observations and microphotography were made by using a
Vanox AHBS (Olympus) microscope. Fujicolor 400 Super G or Fujicolor
1600 Super HG high-speed colour negatives were used for photographs.
B. Results 

The replication of the megachromosome was analyzed by BrdU
pulse labelling followed by immunolabelling. The basic parameters for
DNA labelling in vivo were first established. Using a 30-min pulse of
50 µM BrdU in parallel cultures, samples were taken and fixed at 5 min
intervals from the beginning of the pulse, and every 15 min up to 1 h
after the removal of BrdU. Incorporated BrdU was detected by
immunolabelling with fluorescein-conjugated anti-BrdU monoclonal
antibody. At the first time point (5 min) 38% of the nuclei were labelled.
A gradual increase in the number of labelled nuclei was observed
during incubation in the presence of BrdU, culminating in the
sample taken at the time of the removal of BrdU. At further time points
(60, 75, and 90 min) no significant changes were observed, and the
fraction of labelled nuclei remained constant [44.5-46%].

These results indicate that (i) the incorporation of the BrdU is a
rapid process, (ii) the 30 min pulse-time is sufficient for reliable labelling
of S-phase nuclei, and (iii) the BrdU can be effectively removed from the
cultures by washing.

The length of the cell cycle of the H1D3 and G3D5 cells was
estimated by measuring the time between the appearance of the earliest
BrdU signals on the extreme late replicating chromosome segments and
the appearance of the same pattern only on one of the chromatids of the
chromosomes after one completed cell cycle. The length of G2 period
was determined by the time of the first detectable BrdU signal on
prophase chromosomes and by the labelled mitoses method [Quastler et
al. (1959) Exp. Cell Res. 17:420-438]. The length of the S-phase was
determined in three ways: (i) on the basis of the length of cell cycle and
the fraction of nuclei labelled during the 30-120 min pulse; (ii) by
measuring the time between the very end of the replication of the
extreme late replicating chromosomes and the detection of the first signal
on the chromosomes at the beginning of S phase; (iii) by the labelled
mitoses method. In repeated experiments, the duration of the cell cycle
was found to be 22-26 h, the S phase 10-14 h, and the G2 phase
3.5-4.5 h.
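
Method (i) above can be illustrated with a first-order estimate in which
the S-phase length is taken as the cell cycle length multiplied by the
fraction of labelled nuclei. This is a sketch under a stated assumption:
it treats the culture as asynchronous with cells evenly distributed
through the cycle, which is a simplification and not necessarily the
exact calculation used in these experiments. The inputs are the 22-26 h
cycle length and the 44.5-46% labelled fraction reported above; all
names are illustrative.

```python
# Sketch: rough S-phase length from cycle length and labelled fraction.
# Assumption: evenly distributed, asynchronous population (a simplification).

def s_phase_hours(cycle_hours, labelled_fraction):
    """Return an approximate S-phase duration in hours."""
    if not 0.0 <= labelled_fraction <= 1.0:
        raise ValueError("labelled_fraction must lie between 0 and 1")
    return cycle_hours * labelled_fraction


cycle_range_h = (22.0, 26.0)          # reported cell cycle length
labelled_range = (0.445, 0.46)        # reported fraction of labelled nuclei

estimates = [s_phase_hours(c, f) for c in cycle_range_h for f in labelled_range]
shortest_h = min(estimates)
longest_h = max(estimates)

print(f"approximate S phase: {shortest_h:.1f}-{longest_h:.1f} h")
```

The resulting range overlaps the 10-14 h S phase reported from the
repeated experiments, which is the level of agreement such a rough
estimate can be expected to give.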



WO 97/40183                                              PCT/US97/05911



Analyses of the replication of the megachromosome were made in
parallel cultures by collecting mitotic cells at two hour intervals following
two hours of colchicine treatment. In a repeat experiment, the same
analysis was performed using one hour sample intervals and one hour
colchicine treatment. Although the two procedures gave comparable
results, the two hour sample intervals were viewed as more appropriate
since approximately 30% of the cells were found to have a considerably
shorter or longer cell cycle than the average. The characteristic
replication patterns of the individual chromosomes, especially some of
the late replicating hamster chromosomes, served as useful internal
markers for the different stages of S-phase. To minimize the error
caused by the different lengths of cell cycles in the different experiments,
samples were taken and analyzed throughout the whole cell cycle until
the appearance of the first signals on one chromatid at the beginning of
the second S-phase.

The sequence of replication in the megachromosome is as follows.
At the very beginning of the S-phase, the replication of the
megachromosome starts at the ends of the chromosomes. The first
initiation of replication in an interstitial position can usually be detected at
the centromeric region. Soon after, but still in the first quarter of the
S-phase, when the terminal region of the short arm has almost completed
its replication, discrete initiation signals appear along the chromosome
arms. In the second quarter of the S-phase, as replication proceeds, the
BrdU-labelled zones gradually widen, and the checkered pattern of the
megachromosome becomes clear [see, e.g., Fig. 2F]. At the same time,
pericentric regions of mouse chromosomes also show intense
incorporation of BrdU. The replication of the megachromosome peaks at
the end of the second quarter and in the third quarter of the S-phase. At
the end of the third quarter, and at the very beginning of the last quarter
of the S-phase, the megachromosome and the pericentric
heterochromatin of the mouse chromosomes complete their replication.
By the end of S-phase, only the very late replicating segments of mouse
and hamster chromosomes are still incorporating BrdU.

The replication of the whole genome occurs in distinct phases.
The signal of incorporated BrdU increased continuously until the end of
the first half of the S-phase, but at the beginning of the third quarter of
the S-phase chromosome segments other than the heterochromatic
regions hardly incorporated BrdU. In the last quarter of the S-phase, the
BrdU signals increased again when the extreme late replicating segments
showed very intense incorporation.

Similar analyses of the replication in mouse A9 cells were
performed as controls. To increase the resolution of the immunolabelling
pattern, pericentric regions of A9 chromosomes were decondensed by
treatment with Hoechst 33258. Because of the intense replication of the
surrounding euchromatic sequences, precise localization of the initial
BrdU signal in the heterochromatin was normally difficult, even on
undercondensed mouse chromosomes. On those chromosomes where
the initiation signal(s) were localized unambiguously, the replication of
the pericentric heterochromatin of A9 chromosomes was similar to that
of the megachromosome. Chromosomes of A9 cells also exhibited
replication patterns and sequences similar to those of the mouse
chromosomes in the hybrid cells. These results indicate that the
replicators of the megachromosome and mouse chromosomes retained
their original timing and specificity in the hybrid cells.

By comparing the pattern of the initiation sites obtained after BrdU
incorporation with the location of the integration sites of the "foreign"
DNA and detailed analysis of the first quarter of the S-phase, an attempt
was made to identify origins of replication (initiation sites) in relation to
the amplicon structure of the megachromosome. The double band of
integrated DNA on the long arm of the megachromosome served as a
cytological marker. The results showed a colocalization of the BrdU and
in situ hybridization signals found at the cytological level, indicating that
the "foreign" DNA sequences are in close proximity to the origins of
replication, presumably integrated into the non-satellite sequences
between the replicator and the satellite sequences [see, Figure 3]. As
described in Example 6.B.4, the rDNA sequences detected in the
megachromosome are also localized at the amplicon borders at the site of
integration of the "foreign" DNA sequences, suggesting that the origins
of replication responsible for initiation of replication of the
megachromosome involve rDNA sequences. In the pericentric region of
several other chromosomes, dot-like BrdU signals can also be observed
that are comparable to the initiation signals on the megachromosome.
These signals may represent similar initiation sites in the heterochromatic
regions of normal chromosomes.

At a frequency of lO '^, "uncontrolled" amplification of the 
integrated DNA sequences was observed in the megachromosome. 
Consistent with the assumption (above) that "foreign" sequences are in
proximity of the replicators, this spatially restricted amplification is likely
to be a consequence of uncontrolled repeated firings of the replication
origin(s) without completing the replication of the whole segment.
C. Discussion 

It has generally been thought that the constitutive heterochromatin
of the pericentric regions of chromosomes is late replicating [see, e.g.,
Miller (1976) Chromosoma 55:165-170]. On the contrary, these
experiments provide evidence that the replication of the heterochromatic
blocks starts at a discrete initiation site in the first half of the S-phase and
continues through approximately three-quarters of S-phase. This
discrepancy may be explained in two ways: (i) in normal
chromosomes, actively replicating euchromatic sequences that surround
the satellite DNA obscure the initiation signals, and thus the precise
localization of initiation sites is obscured; (ii) replication in the
heterochromatin can only be detected unambiguously in a period during
the second half of the S-phase, when the bulk of the heterochromatin
replicates and most other chromosomal regions have already completed
their replication, or have not yet started it. Thus, low resolution
cytological techniques, such as analysis of incorporation of radioactively
labelled precursors by autoradiography, only detect prominent replication
signals in the heterochromatin in the second half of S-phase, when
adjacent euchromatic segments are no longer replicating.

In the megachromosome, the primary initiation sites of replication
colocalize with the sites where the "foreign" DNA sequences and rDNA
sequences are integrated at the amplicon borders. Similar initiation
signals were observed at the same time in the pericentric
heterochromatin of some of the mouse chromosomes that do not have
"foreign" DNA, indicating that the replication initiation sites at the
borders of amplicons may reside in the non-satellite flanking sequences
of the satellite DNA blocks. The presence of a primary initiation site at
each satellite DNA doublet implies that this large chromosome segment is
a single huge unit of replication [megareplicon] delimited by the primary
initiation site and the termination point at each end of the unit. Several
lines of evidence indicate that, within this higher-order replication unit,
"secondary" origins and replicons contribute to the complete replication
of the megareplicon:

1. The total replication time of the heterochromatic regions of
the megachromosome was ~9-11 h. At the rate of movement of
replication forks, 0.5-5 kb per minute, that is typical of eukaryotic
chromosomes [Kornberg et al. (1992) DNA Replication, 2nd ed., New
York: W.H. Freeman and Co, p. 474], replication of a ~15 Mb replicon
would require 50-500 h. Alternatively, if only a single replication origin
was used, the average replication speed would have to be 25 kb per
minute to complete replication within 10 h. By comparing the intensity
of the BrdU signals on the euchromatic and the heterochromatic
chromosome segments, no evidence for a 5- to 50-fold difference in their
replication speed was found;

2. Using short BrdU pulse labelling, a single origin of replication
would produce a replication band that moves along the replicon,
reflecting the movement of the replication fork. In contrast, a widening
of the replication zone that finally gave rise to the checkered pattern of
the megachromosome was observed, and within the replication period,
the most intensive BrdU incorporation occurred in the second half of the
S-phase. This suggests that once the megareplicator has been activated,
it permits the activation and firing of "secondary" origins, and that the
replication of the bulk of the satellite DNA takes place from these
"secondary" origins during the second half of the S-phase. This is
supported by the observation that in certain stages of the replication of
the megachromosome, the whole amplicon can apparently be labelled by
a short BrdU pulse.
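
The arithmetic behind the first line of evidence can be reproduced as
follows. This is a minimal sketch that follows the simplification used
in the text, namely a ~15 Mb replicon copied at a single constant fork
rate. The function names are illustrative assumptions and are not part
of the original disclosure.

```python
# Sketch: replication time of a replicon at a given fork rate, and the
# fork rate needed to finish within a given time. Follows the text's
# simplification of one constant rate applied to the whole replicon.

def replication_hours(replicon_kb, fork_kb_per_min):
    """Hours needed to copy replicon_kb at fork_kb_per_min."""
    return replicon_kb / fork_kb_per_min / 60.0


def required_rate_kb_per_min(replicon_kb, hours):
    """Fork rate (kb/min) needed to copy replicon_kb within the given hours."""
    return replicon_kb / (hours * 60.0)


REPLICON_KB = 15_000  # ~15 Mb replicon, as stated in the text

slowest_h = replication_hours(REPLICON_KB, 0.5)          # at 0.5 kb/min
fastest_h = replication_hours(REPLICON_KB, 5.0)          # at 5 kb/min
needed_rate = required_rate_kb_per_min(REPLICON_KB, 10)  # finish in 10 h

print(f"{fastest_h:.0f}-{slowest_h:.0f} h at typical fork rates")
print(f"{needed_rate:.0f} kb/min needed to finish in 10 h")
```

A required rate of 25 kb per minute is 5 to 50 times the typical
0.5-5 kb per minute, which is the 5- to 50-fold difference referred to
in item 1.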

Megareplicators and secondary replication origins seem to be
under strict temporal and spatial control. The first initiation within the
megachromosomes usually occurred at the centromere, and shortly
afterward all the megareplicators become active. The last segment of
the megachromosome to complete replication was usually the second
segment of the long arm. Results of control experiments with mouse A9
chromosomes indicate that replication of the heterochromatin of mouse
chromosomes corresponds to the replication of the megachromosome
amplicons. Therefore, the pre-existing temporal control of replication in
the heterochromatic blocks is preserved in the megachromosome.

Positive [Hassan et al. (1994) J. Cell. Sci. 107:425-434] and negative
[Haase et al. (1994) Mol. Cell. Biol. 14:2516-2524] correlations
between transcriptional activity and initiation of replication have been
proposed. In the megachromosome, transcription of the integrated genes
seems to have no effect on the original timing of the replication origins.
The concerted, precise timing of the megareplicator initiations in the
different amplicons suggests the presence of specific, cis-acting
sequences, origins of replication.

Considering that pericentric heterochromatin of mouse
chromosomes contains thousands of short, simple repeats spanning
7-15 Mb, and the centromere itself may also contain hundreds of kilobases,
the existence of a higher-order unit of replication seems probable. The
observed uncontrolled intrachromosomal amplification restricted to a
replication initiation region of the megachromosome is highly suggestive
of a rolling-circle type amplification, and provides additional evidence for
the presence of a replication origin in this region.

The finding that a specific replication initiation site occurs at the
boundaries of amplicons suggests that replication might play a role in the
amplification process. These results suggest that each amplicon of the
megachromosome can be regarded as a huge megareplicon defined by a
primary initiation site [megareplicator] containing "secondary" origins of
replication. Fusion of replication bubbles from different origins of
bi-directional replication [DePamphilis (1993) Ann. Rev. Biochem. 62:29-63]
within the megareplicon could form a giant replication bubble, which
would correspond to the whole megareplicon. In the light of this, the
formation of megabase-size amplicons can be accommodated by a
replication-directed amplification mechanism. In H and E-type
amplifications, intrachromosomal multiplication of the amplicons was
observed [see, above EXAMPLES], which is consistent with the unequal
sister chromatid exchange model. Induced or spontaneous unscheduled
replication of a megareplicon in the constitutive heterochromatin may
also form new amplicon(s) leading to the expansion of the amplification
or to the heterochromatic polymorphism of "normal" chromosomes. The
"restoration" of the missing segment on the long arm of the
megachromosome may well be the result of the re-replication of one
amplicon limited to one strand.

Taken together, without being bound by any theory, a
replication-directed mechanism is a plausible explanation for the initiation of
large-scale amplifications in the centromeric regions of mouse chromosomes,
as well as for the de novo chromosome formations. If specific
[amplificator, i.e., sequences controlling amplification] sequences play a role in
promoting the amplification process, sequences at the primary replication
initiation site [megareplicator] of the megareplicon are possible
candidates.

The presence of rRNA gene sequence at the amplicon borders near
the foreign DNA in the megachromosome suggests that this sequence
contributes to the primary replication initiation site and participates in
large-scale amplification of the pericentric heterochromatin in de novo
formation of SATACs. Ribosomal RNA genes have an intrinsic
amplification mechanism that provides for multiple copies of tandem
genes. Thus, for purposes herein, in the construction of SATACs in
cells, rDNA will serve as a region for targeted integration, and as
components of SATACs constructed in vitro.

EXAMPLE 9

To show that the events described in EXAMPLES 2-7 are not
unique to mouse chromosome 7 and to show that the EC3/7 cell line is
not required for formation of the artificial chromosomes, the experiments
have been repeated using different initial cell lines and DNA fragments.
Any cell or cell line should be amenable to use, or it can readily be
determined that it is not.
10 A. Materials 

The LP11 cell line was produced by the "scrape-loading"
transfection method [Fechheimer et al. (1987) Proc. Natl. Acad. Sci.
U.S.A. ...] using 25 µg plasmid DNA for 5 x 10⁶ recipient
cells. LP11 cells were maintained in F-12 medium containing 3-15 µg/ml
Puromycin [SIGMA].

B. Amplification in LP 11 cells 

The large-scale amplification described in the above Examples is
not restricted to the transformed EC3/7 cell line or to the chromosome 7
of mouse. In an independent transformation experiment, LMTK⁻ cells
were transfected using the calcium phosphate precipitation procedure
with a selectable puromycin-resistance gene-containing construct
designated pPuroTel [see Example 1.E.2. for a description of this plasmid], to
establish cell line LP11. Cell line LP11 carries chromosome(s) with
amplified chromosome segments of different lengths [~150-600 Mb].
Cytological analysis of the LP11 cells indicated that the amplification
occurred in the pericentric region of the long arm of a submetacentric
chromosome formed by Robertsonian translocation. This chromosome
was identified by G-banding as chromosome 1. C-banding and in
situ hybridization with mouse major satellite DNA probe showed that an
E-type amplification had occurred: the newly formed region was
composed of an array of euchromatic chromosome segments containing
different amounts of heterochromatin. The size and C-band pattern of
the amplified segments were heterogeneous. In several cells, the number
of these amplified units exceeded 50; single-cell subclones of LP11 cell
lines, however, carry stable marker chromosomes with 10-15 segments
and constant C-band patterns.

Sublines of the thymidine kinase-deficient LP11 cells (e.g.,
LP11-15P 1C5/7 cell line) established by single-cell cloning of LP11 cells were
transfected with a thymidine kinase gene construct. Stable TK⁺
transfectants were established.

EXAMPLE 10

Isolation of SATACS and other chromosomes with atypical base content
and/or size

I. Isolation of artificial chromosomes from endogenous
chromosomes

Artificial chromosomes, such as SATACs, may be sorted from 
endogenous chromosomes using any suitable procedures, and typically 
involve isolating metaphase chromosomes, distinguishing the artificial 

chromosomes from the endogenous chromosomes, and separating the
artificial chromosomes from endogenous chromosomes. Such 
procedures will generally include the following basic steps: (1) culture of 
a sufficient number of cells (typically about 2 x 10^ mitotic cells) to yield, 
preferably on the order of 1x10® artificial chromosomes, (2) arrest of 

the cell cycle of the cells in a stage of mitosis, preferably metaphase,
using a mitotic arrest agent such as colchicine, (3) treatment of the cells,
particularly by swelling of the cells in hypotonic buffer, to increase
susceptibility of the cells to disruption, (4) application of physical
force to disrupt the cells in the presence of isolation buffers for
stabilization of the released chromosomes, (5) dispersal of chromosomes
in the presence of isolation buffers for stabilization of free chromosomes,
(6) separation of artificial from endogenous chromosomes and (7) storage
(and shipping if desired) of the isolated artificial chromosomes in
appropriate buffers. Modifications and variations of the general
procedure for isolation of artificial chromosomes, for example to
accommodate different cell types with differing growth characteristics
and requirements and to optimise the duration of mitotic block with
arresting agents to obtain the desired balance of chromosome yield and
level of debris, may be empirically determined.
Steps 1-5 relate to isolation of metaphase chromosomes. The
separation of artificial from endogenous chromosomes (step 6) may be
accomplished in a variety of ways. For example, the chromosomes may
be stained with DNA-specific dyes such as Hoechst 33258 and
chromomycin A3 and sorted into artificial and endogenous chromosomes
on the basis of dye content by employing fluorescence-activated cell
sorting (FACS). To facilitate larger scale isolation of the artificial
chromosomes, different separation techniques may be employed such as
swinging bucket centrifugation (to effect separation based on
chromosome size and density) [see, e.g., Mendelsohn et al. (1968) J.
Mol. Biol. 32:101-108], zonal rotor centrifugation (to effect separation on
the basis of chromosome size and density) [see, e.g., Burki et al. (1973)
Prep. Biochem. 3:157-182; Stubblefield et al. (1978) Biochem. Biophys.
Res. Commun. 83:1404-1414], and velocity sedimentation (to effect
separation on the basis of chromosome size and shape) [see, e.g., Collard
et al. (1984) Cytometry 5:9-19]. Immuno-affinity purification may also
be employed in larger scale artificial chromosome isolation procedures.
In this process, large populations of artificial chromosome-containing
cells (asynchronous or mitotically enriched) are harvested en masse and
the mitotic chromosomes (which can be released from the cells using
standard procedures such as by incubation of the cells in hypotonic
buffer and/or detergent treatment of the cells in conjunction with
physical disruption of the treated cells) are enriched by binding to
antibodies that are bound to solid state matrices (e.g., column resins or
magnetic beads). Antibodies suitable for use in this procedure bind to
condensed centromeric proteins or condensed and DNA-bound histone
proteins. For example, autoantibody LU851 (see Hadlaczky et al. (1989)
Chromosoma 97:282-288), which recognizes mammalian centromeres,
may be used for large-scale isolation of chromosomes prior to
subsequent separation of artificial from endogenous chromosomes using
methods such as FACS. The bound chromosomes would be washed and
eventually eluted for sorting. Immunoaffinity purification may also be
used directly to separate artificial chromosomes from endogenous
chromosomes. For example, SATACs may be generated in or transferred
to (e.g., by microinjection or microcell fusion as described herein) a cell
line that has chromosomes that contain relatively small amounts of
heterochromatin, such as hamster cells (e.g., V79 cells or CHO-K1 cells).
The SATACs, which are predominantly heterochromatin, are then
separated from the endogenous chromosomes by utilizing anti-heterochromatin
binding protein (Drosophila HP-1) antibody conjugated to a solid matrix.
Such matrix preferentially binds SATACs relative to hamster
chromosomes. Unbound hamster chromosomes are washed away from
the matrix and the SATACs are eluted by standard techniques.
A. Cell lines and cell culturing procedures

In one isolation procedure, 1B3 mouse-hamster-human hybrid cells
[see Figure 4] carrying the megachromosome or the truncated
megachromosome were grown in F-12 medium supplemented with 10%
fetal calf serum, 150 μg/ml hygromycin B and 400 μg/ml G418. GHB42
[a cell line recloned from G3D5 cells] mouse-hamster hybrid cells carrying
the megachromosome and the minichromosome were also cultured in F-12
medium containing 10% fetal calf serum, 150 μg/ml hygromycin B
and 400 μg/ml G418. The doubling time of both cell lines was about
24-40 hours, typically about 32 hours.

Typically, cell monolayers are passaged when they reach about
60-80% confluence and are split every 48-72 hours. Cells that reach
greater than 80% confluence senesce in culture and are not preferred for
chromosome harvesting. Cells may be plated in 100-200 100-mm dishes
at about 50-70% confluency 12-30 hours before mitotic arrest (see
below).

Other cell lines that may be used as hosts for artificial chromosomes
and from which the artificial chromosomes may be isolated include,
but are not limited to, PtK1 (NBL-3) marsupial kidney cells (ATCC
accession no. CCL35), CHO-K1 Chinese hamster ovary cells (ATCC
accession no. CCL61), V79-4 Chinese hamster lung cells (ATCC accession
no. CCL93), Indian muntjac skin cells (ATCC accession no. CCL157),
LMTK(-) thymidine kinase deficient murine L cells (ATCC accession no.
CCL1.3), Sf9 fall armyworm (Spodoptera frugiperda) ovary cells (ATCC
accession no. CRL 1711) and any generated heterokaryon (hybrid) cell
lines, such as, for example, the hamster-murine hybrid cells described
herein, that may be used to construct MACs, particularly SATACs.

Cell lines may be selected, for example, to enhance efficiency of
artificial chromosome production and isolation as may be desired in
large-scale production processes. For instance, one consideration in selecting
host cells may be the artificial chromosome-to-total chromosome ratio of
the cells. To facilitate separation of artificial chromosomes from
endogenous chromosomes, a higher artificial chromosome-to-total
chromosome ratio might be desirable. For example, for H1D3 cells (a
murine/hamster heterokaryon; see Figure 4), this ratio is 1:50, i.e., one
artificial chromosome (the megachromosome) to 50 total chromosomes.
In contrast, Indian muntjac skin cells (ATCC accession no. CCL157)
contain a smaller total number of chromosomes (a diploid number of
chromosomes of 7), as do kangaroo rat cells (a diploid number of
chromosomes of 12), which would provide for a higher artificial
chromosome-to-total chromosome ratio upon introduction of, or
generation of, artificial chromosomes in the cells.
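
The ratio comparison in the preceding paragraph can be illustrated with a short calculation. This is only a sketch: it assumes that a single artificial chromosome is present per cell and that the total count includes the artificial chromosome, and the function names are hypothetical rather than part of any described procedure.

```python
from fractions import Fraction


def ratio_from_total(total_chromosomes, artificial=1):
    """Artificial-to-total ratio when the total already includes the artificial chromosome(s)."""
    return Fraction(artificial, total_chromosomes)


def ratio_after_introduction(endogenous, artificial=1):
    """Ratio after adding artificial chromosomes to a cell with `endogenous` chromosomes.

    Assumption: total = endogenous + artificial.
    """
    return Fraction(artificial, endogenous + artificial)


h1d3 = ratio_from_total(50)                  # 1:50, as stated for H1D3 cells
muntjac = ratio_after_introduction(7)        # diploid number of 7
kangaroo_rat = ratio_after_introduction(12)  # diploid number of 12

for name, ratio in [("H1D3", h1d3), ("Indian muntjac", muntjac), ("kangaroo rat", kangaroo_rat)]:
    print(f"{name}: {ratio.numerator}:{ratio.denominator}")
```

Under these assumptions the muntjac and kangaroo rat hosts give ratios several-fold higher than the H1D3 value, which is the point made in the text.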

Another consideration in selecting host cells for production and
isolation of artificial chromosomes may be size of the endogenous
chromosomes as compared to that of the artificial chromosomes. Size
differences of the chromosomes may be exploited to facilitate separation
of artificial chromosomes from endogenous chromosomes. For example,
because Indian muntjac skin cell chromosomes are considerably larger
than minichromosomes and truncated megachromosomes, separation of
the artificial chromosome from the muntjac chromosomes may possibly
be accomplished using univariate (one dye, either Hoechst 33258 or
chromomycin A3) FACS separation procedures.

Another consideration in selecting host cells for production and
isolation of artificial chromosomes may be the doubling time of the cells.
For example, the amount of time required to generate a sufficient number
of artificial chromosome-containing cells for use in procedures to isolate
artificial chromosomes may be of significance for large-scale production.
Thus, host cells with shorter doubling times may be desirable. For
instance, the doubling time of V79 hamster lung cells is about 9-10 hours
in comparison to the approximately 32-hour doubling time of H1D3 cells.
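
The effect of doubling time on production time can be estimated with a simple exponential-growth calculation. The sketch below assumes ideal, uninterrupted exponential growth with no lag phase or passaging losses, and the 1000-fold expansion target is a hypothetical figure chosen only for illustration.

```python
import math


def hours_to_expand(doubling_time_h, fold_expansion):
    """Hours needed for a culture to expand `fold_expansion`-fold.

    Assumes ideal exponential growth: N(t) = N0 * 2 ** (t / doubling_time).
    """
    return doubling_time_h * math.log2(fold_expansion)


FOLD = 1000  # hypothetical expansion target, not taken from the text

v79_hours = hours_to_expand(10, FOLD)   # V79 cells, upper end of the 9-10 hour range
h1d3_hours = hours_to_expand(32, FOLD)  # H1D3 cells, approximately 32 hours

print(f"V79:  {v79_hours:.0f} h ({v79_hours / 24:.1f} days)")
print(f"H1D3: {h1d3_hours:.0f} h ({h1d3_hours / 24:.1f} days)")
```

Because the time scales linearly with doubling time, the 32-hour line needs about 3.2 times as long as the 10-hour line for any chosen expansion factor.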

Accordingly, several considerations may go into the selection of
host cells for the production and isolation of artificial chromosomes. It
may be that the host cell selected as the most desirable for de novo
formation of artificial chromosomes is not optimized for large-scale
production of the artificial chromosomes generated in the cell line. In
such cases, it may be possible, once the artificial chromosome has been
generated in the initial host cell line, to transfer it to a production cell line
better suited to efficient, high-level production and isolation of the
artificial chromosome. Such transfer may be accomplished through
several methods, for example through microcell fusion, as described
herein, or microinjection into the production cell line of artificial
chromosomes purified from the generating cell line using procedures such
as described herein. Production cell lines preferably contain two or more
copies of the artificial chromosome per cell.
B. Chromosome isolation 
In general, cells are typically cultured for two generations at
exponential growth prior to mitotic arrest. To accumulate mitotic 1B3
and GHB42 cells in one particular isolation procedure, 5 μg/ml colchicine
was added for 12 hours to the cultures. The mitotic index obtained was
60-80%. The mitotic cells were harvested by selective detachment by
gentle pipetting of the medium on the monolayer cells. It is also possible
to utilize mechanical shake-off as a means of releasing the rounded-up
(mitotic) cells from the plate. The cells were sedimented by
centrifugation at 200 x g for 10 minutes.

Cells (grown on plastic or in suspension) may be arrested in
different stages of the cell cycle with chemical agents other than
colchicine, such as hydroxyurea, vinblastine, colcemid or aphidicolin.
Chemical agents that arrest the cells in stages other than mitosis, such
as hydroxyurea and aphidicolin, are used to synchronize the cycles of all
cells in the population and then are removed from the cell medium to
allow the cells to proceed, more or less simultaneously, to mitosis, at
which time they may be harvested to disperse the chromosomes. Mitotic
cells could be enriched by mechanical shake-off (adherent cells). The
cell cycles of cells within a population of MAC-containing cells may also
be synchronized by nutrient, growth factor or hormone deprivation, which
leads to an accumulation of cells in the G1 or G0 stage; readdition of
nutrients or growth factors then allows the quiescent cells to re-enter
the cell cycle in synchrony for about one generation. Cell lines that are
known to respond to hormone deprivation in this manner, and which are
suitable as hosts for artificial chromosomes, include the Nb2 rat
lymphoma cell line, which is absolutely dependent on prolactin for
stimulation of proliferation (see Gout et al. (1980) Cancer Res.
40:2433-2436). Culturing the cells in prolactin-deficient medium for 18-24 hours
leads to arrest of proliferation, with cells accumulating early in the G1
phase of the cell cycle. Upon addition of prolactin, all the cells progress
through the cell cycle until M phase, at which point greater than 90% of
the cells would be in mitosis (addition of colchicine could increase the
amount of the mitotic cells to greater than 95%). The time between
reestablishing proliferation by prolactin addition and harvesting mitotic
cells for chromosome separation may be empirically determined.

Alternatively, adherent cells, such as V79 cells, may be grown in
roller bottles and mitotic cells released from the plastic surface by
rotating the roller bottles at 200 rpm or greater (Shwarchuk et al. (1993)
Int. J. Radiat. Biol. 64:601-612). At any given time, approximately 1%
of the cells in an exponentially growing asynchronous population are in
M phase. Even without the addition of colchicine, 2 x 10^ mitotic cells
have been harvested from four 1750-cm² roller bottles after a 5-min spin
at 200 rpm. Addition of colchicine for 2 hours may increase the yield to
6 X 10® mitotic cells. 

Several procedures may be used to isolate metaphase
chromosomes from these cells, including, but not limited to, one based
on a polyamine buffer system [Cram et al. (1990) Methods in Cell
Biology 33:377-382]; one on a modified hexylene glycol buffer system
[Hadlaczky et al. (1982) Chromosoma 86:643-659]; one on a magnesium
sulfate buffer system [Van den Engh et al. (1988) Cytometry 9:266-270
and Van den Engh et al. (1984) Cytometry 5:108]; one on an acetic acid
fixation buffer system [Stoehr et al. (1982) Histochemistry 74:57-61];
and one on a technique utilizing hypotonic KCl and propidium iodide
[Cram et al. (1994) XVII meeting of the International Society for
Analytical Cytology, October 16-21, Tutorial IV Chromosome Analysis
and Sorting with Commercial Flow Cytometers; Cram et al. (1990)
Methods in Cell Biology 33:376].

1. Polyamine procedure 

In the polyamine procedure that was used in isolating artificial
chromosomes from either 1B3 or GHB42 cells, about 10^ mitotic cells
were incubated in 10 ml hypotonic buffer (75 mM KCl, 0.2 mM
spermine, 0.5 mM spermidine) for 10 minutes at room temperature to
swell the cells. The cells are swollen in hypotonic buffer to loosen the
metaphase chromosomes but not to the point of cell lysis. The cells
were then centrifuged at 100 x g for 8 minutes, typically at room
temperature. The cell pellet was drained carefully and about 10^ cells
were resuspended in 1 ml polyamine buffer [15 mM Tris-HCl, 20 mM
NaCl, 80 mM KCl, 2 mM EDTA, 0.5 mM EGTA, 14 mM
β-mercaptoethanol, 0.1% digitonin, 0.2 mM spermine, 0.5 mM spermidine] for
physical dispersal of the metaphase chromosomes. Chromosomes were
then released by gently drawing the cell suspension up and expelling it
through a 22 G needle attached to a 3 ml plastic syringe. The
chromosome concentration was about 1-3 x 10^ chromosomes/ml.
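
As a worked illustration of preparing the 10 ml hypotonic buffer described above, the following sketch applies the usual dilution relation C1·V1 = C2·V2. The stock concentrations (1 M KCl, 100 mM spermine, 100 mM spermidine) and the function name are assumptions made for this example only; they are not specified in the procedure.

```python
def stock_volumes_ul(final_volume_ml, targets_mM, stocks_mM):
    """Return microliters of each stock (and of water) needed for the final volume.

    Uses C1 * V1 = C2 * V2 for each component. Stock concentrations are
    supplied by the caller; the values used below are hypothetical.
    """
    final_ul = final_volume_ml * 1000.0
    volumes = {
        name: final_ul * targets_mM[name] / stocks_mM[name] for name in targets_mM
    }
    water = final_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested final volume")
    volumes["water"] = water
    return volumes


# Target concentrations from the hypotonic buffer in the text.
targets = {"KCl": 75.0, "spermine": 0.2, "spermidine": 0.5}
# Hypothetical stock concentrations (mM), assumed for illustration.
stocks = {"KCl": 1000.0, "spermine": 100.0, "spermidine": 100.0}

recipe = stock_volumes_ul(10, targets, stocks)
for name, microliters in recipe.items():
    print(f"{name}: {microliters:.0f} ul")
```

With these assumed stocks the recipe is 750 μl KCl, 20 μl spermine, 50 μl spermidine and 9180 μl water; other stock concentrations simply change the volumes proportionally.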

The polyamine buffer isolation protocol is well suited for obtaining
high molecular weight chromosomal DNA [Sillar and Young (1981) J.
Histochem. Cytochem. 29:74-78; Van Dilla et al. (1986) Biotechnology
4:537-552; Bartholdi et al. (1988) In "Molecular Genetics of Mammalian
Cells" (M. Gottesman, ed.), Methods in Enzymology 151:252-267.
Academic Press, Orlando]. The chromosome stabilizing buffer uses the
polyamines spermine and spermidine to stabilize chromosome structure
[Blumenthal et al. (1979) J. Cell Biol. 81:255-259; Lalande et al. (1985)
Cancer Genet. Cytogenet. 23:151-157] and heavy metal chelators to
reduce nuclease activity.

The polyamine buffer protocol has wide applicability; however, as
with other protocols, the following variables must be optimized for each
cell type: blocking time, cell concentration, type of hypotonic swelling
buffer, swelling time, volume of hypotonic buffer, and vortexing time.
Chromosomes prepared using this protocol are typically highly
condensed.

There are several hypotonic buffers that may be used to swell the
cells, for example buffers such as the following: 75 mM KCl; 75 mM KCl,
0.2 mM spermine, 0.5 mM spermidine; Ohnuki's buffer of 16.2 mM
sodium nitrate, 6.5 mM sodium acetate, 32.4 mM KCl [Ohnuki (1965)
Nature 208:916-917 and Ohnuki (1968) Chromosoma 25:402-428]; and
a variation of Ohnuki's buffer that additionally contains 0.2 mM spermine
and 0.5 mM spermidine. The amount and hypotonicity of added buffer
vary depending on cell type and cell concentration. Amounts may range
from 2.5-5.5 ml per 10^ cells or more. Swelling times may vary from
10-90 minutes depending on cell type and which swelling buffer is used.

The composition of the polyamine isolation buffer may also be
varied. For example, one modified buffer contains 15 mM Tris-HCl, pH
7.2, 70 mM NaCl, 80 mM KCl, 2 mM EDTA, 0.5 mM EGTA, 14 mM
beta-mercaptoethanol, 0.25% Triton-X, 0.2 mM spermine and 0.5 mM
spermidine.

Chromosomal dispersal may also be accomplished by a variety of
physical means. For example, cell suspension may be gently drawn up
and expelled in a 3-ml syringe fitted with a 22-gauge needle [Cram et al.
(1990) Methods in Cell Biology 33:377-382], cell suspension may be
agitated on a bench-top vortex [Cram et al. (1990) Methods in Cell
Biology 33:377-382], cell suspension may be disrupted with a
homogenizer [Sillar and Young (1981) J. Histochem. Cytochem. 29:74-78;
Carrano et al. (1979) Proc. Natl. Acad. Sci. U.S.A. 76:1382-1384]
and cell suspension may be disrupted with a bench-top ultrasonic bath
[Stoehr et al. (1982) Histochemistry 74:57-61].

2. Hexylene glycol buffer system

In the hexylene glycol buffer procedure that was used in isolating
artificial chromosomes from either 1B3 or GHB42 cells, about 8 x 10^
mitotic cells were resuspended in 10 ml glycine-hexylene glycol buffer
[100 mM glycine, 1% hexylene glycol, pH 8.4-8.6 adjusted with
saturated Ca-hydroxide solution] and incubated for 10 minutes at 37°C,
followed by centrifugation for 10 minutes to pellet the nuclei. The
supernatant was centrifuged again at 200 x g for 20 minutes to pellet
the chromosomes. Chromosomes were resuspended in isolation buffer
(1-3 x 10^ chromosomes/ml).

The hexylene glycol buffer composition may also be modified. For
example, one modified buffer contains 25 mM Tris-HCl, pH 7.2, 750 mM
hexylene glycol, 0.5 mM CaCl2, 1.0 mM MgCl2 [Carrano et al. (1979)
Proc. Natl. Acad. Sci. U.S.A. 76:1382-1384].

3. Magnesium-sulfate buffer system

This buffer system may be used with any of the methods of cell
swelling and chromosomal dispersal, such as described above in
connection with the polyamine and hexylene glycol buffer systems. In
this procedure, mitotic cells are resuspended in the following buffer: 4.8
mM HEPES, pH 8.0, 9.8 mM MgSO4, 48 mM KCl, 2.9 mM dithiothreitol
[Van den Engh et al. (1985) Cytometry 6:92 and Van den Engh et al.
(1984) Cytometry 5:108].

4. Acetic acid fixation buffer system 

This buffer system may be used with any of the methods of cell 
swelling and chromosomal dispersal, such as described above in 
connection with the polyamine and hexylene glycol buffer systems. In 
this procedure, mitotic cells are resuspended in the following buffer: 25
mM Tris-HCl, pH 3.2, 750 mM (1,6)-hexanediol, 0.5 mM CaCl2, 1.0%
acetic acid [Stoehr et al. (1982) Histochemistry 74:57-61].

5. KCl-propidium iodide buffer system

This buffer system may be used with any of the methods of cell
swelling and chromosomal dispersal, such as described above in
connection with the polyamine and hexylene glycol buffer systems. In
this procedure, mitotic cells are resuspended in the following buffer: 25
mM KCl, 50 μg/ml propidium iodide, 0.33% Triton X-100, 333 μg/ml
RNase [Cram et al. (1990) Methods in Cell Biology 33:376].

The fluorescent dye propidium iodide is used and also serves as a
chromosome stabilizing agent. Swelling of the cells in the hypotonic
medium (which may also contain propidium iodide) may be monitored by
placing a small drop of the suspension on a microscope slide and
observing the cells by phase/fluorescent microscopy. The cells should
exclude the propidium iodide while swelling, but some may lyse
prematurely and show chromosome fluorescence. After the cells have
been centrifuged and resuspended in the KCl-propidium iodide buffer
system, they will be lysed due to the presence of the detergent in the
buffer. The chromosomes may then be dispersed and then incubated at
37°C for up to 30 minutes to permit the RNase to act. The chromosome
preparation is then analyzed by flow cytometry. The propidium iodide
fluorescence can be excited at the 488 nm wavelength of an argon laser
and detected through an OG 570 optical filter by a single photomultiplier
tube. The single pulse may be integrated and acquired in a univariate
histogram. The flow cytometer may be aligned [...]
using small (1.5 μm diameter) microspheres. The chromosome
preparation is filtered through 60 μm nylon mesh before analysis.

C. Staining of chromosomes with DNA-specific dyes

Subsequent to isolation, the chromosome preparation was stained
with Hoechst 33258 at 6 μg/ml and chromomycin A3 at 200 μg/ml.
Fifteen minutes prior to analysis, 25 mM Na-sulphite and 10 mM
Na-citrate were added to the chromosome suspension.
D. Flow sorting of chromosomes

Chromosomes obtained from 1B3 and GHB42 cells and maintained [...]
were suspended in a polyamine-based sheath buffer (0.5 mM EGTA, 2.0
mM EDTA, 80 mM KCl, 70 mM NaCl, 15 mM Tris-HCl, pH 7.2, 0.2 mM
spermine and 0.5 mM spermidine) [Sillar and Young (1981) J.
Histochem. Cytochem. 29:74-78] and were then passed
through a dual-laser cell sorter [FACStar Plus or FACStar Vantage, Becton
Dickinson Immunocytometry System; other dual-laser sorters may also be
used, such as those manufactured by Coulter Electronics (Elite ESP) and
Cytomation (MoFlo)] in which two lasers were set to excite the dyes
separately, allowing a bivariate analysis of the chromosome by size and
base-pair composition. Because of the difference between the base
composition of the SATACs and the other chromosomes and the
resulting difference in interaction with the dyes, as well as size
differences, the SATACs were separated from the other chromosomes.
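
The idea of a bivariate sort can be sketched in a few lines. The sketch relies on the general property that Hoechst 33258 binds preferentially to AT-rich DNA and chromomycin A3 to GC-rich DNA, so the two signals together reflect both DNA content and base composition. Everything else here is an assumption for illustration: the intensity values, the gate boundaries, and the function names are hypothetical and do not describe the settings of any actual sorter.

```python
def bivariate_features(hoechst, chromomycin):
    """Reduce two dye signals to (total signal, Hoechst fraction).

    Total signal is a rough proxy for chromosome size; the Hoechst fraction
    is a rough proxy for AT-richness. Both are simplifications.
    """
    total = hoechst + chromomycin
    return total, hoechst / total


def in_gate(hoechst, chromomycin, total_range, fraction_range):
    """True if an event falls inside a rectangular gate on the two features."""
    total, fraction = bivariate_features(hoechst, chromomycin)
    return (total_range[0] <= total <= total_range[1]
            and fraction_range[0] <= fraction <= fraction_range[1])


# Hypothetical events (arbitrary fluorescence units) and a hypothetical gate.
events = [(80.0, 20.0), (50.0, 50.0), (160.0, 40.0)]
gate = {"total_range": (50.0, 150.0), "fraction_range": (0.7, 0.9)}

sorted_events = [e for e in events if in_gate(*e, **gate)]
print(sorted_events)
```

In this toy data set only the first event passes: the second has the wrong dye ratio and the third has the right ratio but too much total signal, mirroring the combined use of composition and size.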
E. Storage of the sorted artificial chromosomes

Sorted chromosomes may be pelleted by centrifugation and
resuspended in a variety of buffers, and stored at 4°C. For example, the
isolated artificial chromosomes may be stored in GH buffer (100 mM
glycine, 1% hexylene glycol, pH 8.4-8.6 adjusted with saturated
Ca-hydroxide solution) [see, e.g., Hadlaczky et al. (1982) Chromosoma
86:643-659] for one day and embedded by centrifugation into agarose.
The sorted chromosomes were centrifuged into an agarose bed and the
plugs were stored in 500 mM EDTA at 4°C. Additional storage buffers
include CMB-I/polyamine buffer (17.5 mM Tris-HCl, pH 7.4, 1.1 mM
EDTA, 50 mM epsilon-amino caproic acid, 5 mM benzamide-HCl, 0.40
mM spermine, 1.0 mM spermidine, 0.25 mM EGTA, 40 mM KCl, 35 mM
NaCl) and CMB-II/polyamine buffer (100 mM glycine, pH 7.5, 78 mM
hexylene glycol, 0.1 mM EDTA, 50 mM epsilon-amino caproic acid, 5
mM benzamide-HCl, 0.40 mM spermine, 1.0 mM spermidine, 0.25 mM
EGTA, 40 mM KCl, 35 mM NaCl).

When microinjection is the intended use, the sorted chromosomes
are stored in 30% glycerol at -20°C. Sorted chromosomes may also be
stored without glycerol for short periods of time (3-6 days) in storage
buffers at 4°C. Exemplary buffers for microinjection include CBM-I (10
mM Tris-HCl, pH 7.5, 0.1 mM EDTA, 50 mM epsilon-amino caproic acid,
5 mM benzamide-HCl, 0.30 mM spermine, 0.75 mM spermidine) and CBM-II
(100 mM glycine, pH 7.5, 78 mM hexylene glycol, 0.1 mM EDTA, 50
mM epsilon-amino caproic acid, 5 mM benzamide-HCl, 0.30 mM
spermine, 0.75 mM spermidine).

For long-term storage of sorted chromosomes, the above buffers
are preferably supplemented with 50% glycerol and stored at -20°C.
F. Quality control

1. Analysis of the purity

The purity of the sorted chromosomes was checked by
fluorescence in situ hybridization (FISH) with a
satellite DNA probe [see Hadlaczky et al. (1991) Proc. Natl. Acad. Sci.
U.S.A. 88:8106-8110]. Purity of the isolated chromosomes was about
97-99%.

2. Characteristics of the sorted chromosomes 

Pulsed field gel electrophoresis and Southern hybridization were
carried out to determine the size distribution of the DNA content of the
sorted artificial chromosomes.

G. Functioning of the purified artificial chromosomes

To check whether their activity is preserved, the purified artificial
chromosomes may be microinjected (using methods such as those
described in Example 13) into primary cells, somatic cells and stem cells,
which are then analyzed for expression of the heterologous genes carried
by the artificial chromosomes, e.g., by analysis for growth on
selective medium and assays of β-galactosidase activity.
II. Sorting of mammalian artificial chromosome-containing microcells 

A. Micronucleation 

Cells were grown to 80-90% confluency in 4 T150 flasks.
Colcemid was added to a final concentration of 0.06 μg/ml, and then
incubated with the cells at 37°C for 24 hours.

B. Enucleation 

Ten μg/ml cytochalasin B was added and the resulting microcells
were centrifuged at 15,000 rpm for 70 minutes at 28-33°C.

C. Purification of microcells by filtration

The microcells were purified using Swinnex and
Nucleopore filters [5 μm and 3 μm].

D. Staining and sorting microcells

As above, the cells were stained with Hoechst and chromomycin
A3 dyes. The microcells were sorted by cell sorter to isolate the
microcells that contain the mammalian artificial chromosomes.

E. Fusion

The microcells that contain the artificial chromosome are fused, for
example, as described in Example I.A.5., to selected primary cells,
somatic cells, embryonic stem cells to generate transgenic (non-human)
animals and for gene therapy purposes, and to other cells to deliver the
chromosomes to the cells.

EXAMPLE 11 

Introduction of mammalian artificial chromosomes into insect cells 

Insect cells are useful hosts for MACs, particularly for use in the
production of gene products, for a number of reasons, including:

1. A mammalian artificial chromosome provides an extra-genomic
specific integration site for introduction of genes encoding
proteins of interest [reduced chance of mutation in production system].

2. The large size of an artificial chromosome permits megabase
size DNA integration so that genes encoding an entire pathway leading to
a protein or nonprotein of therapeutic value, such as an alkaloid [digitalis,
morphine, taxol] can be accommodated by the artificial chromosome.

3. Amplification of genes encoding useful proteins can be
accomplished in the artificial mammalian chromosome to obtain higher
protein yields in insect cells.

4. Insect cells support required post-translational modifications
(glycosylation, phosphorylation) essential for protein biological function.

5. Insect cells do not support mammalian viruses, which eliminates
cross-contamination of product with human infectious agents.

6. The ability to introduce chromosomes circumvents
traditional recombinant baculovirus systems for production of nutritional,
industrial [...]

7. [...] permits reduced energy cost of production.

8. Serum free growth medium for insect cells will result in
lower production costs.

9. Artificial chromosome-containing cells can be stored
indefinitely at low temperature.

10. Insect larvae will serve as biological factories for the
production of nutritional, medicinal or industrial proteins by microinjection
of fertilized insect eggs.

A. Demonstration that insect cells recognize mammalian promoters

Gene constructs containing a mammalian promoter, such as the
CMV promoter, linked to a detectable marker gene [Renilla luciferase
gene (see, e.g., U.S. Patent No. 5,292,658 for a description of DNA
encoding the Renilla luciferase, and plasmid pTZrLuc1, which can
provide the starting material for construction of such vectors; see also
SEQ ID No. 10)] and also including the simian virus 40 (SV40) promoter
operably linked to the β-galactosidase gene were introduced into the cells
of two species, Trichoplusia ni [cabbage looper] and Bombyx mori [silk
worm].

After transferring the constructs into the insect cell lines either by
electroporation or by microinjection, expression of the marker genes was
detected in luciferase assays (see, e.g., Example 12.C.3) and in
β-galactosidase assays (such as lacZ staining assays) after a 24-h
incubation. In each case a positive result was obtained in the samples
[...] were omitted. In addition, a B. mori β-actin promoter-Renilla luciferase
gene fusion was introduced into the T. ni and B. mori cells, which yielded
light emission after transfection. Thus, certain mammalian promoters
function to direct expression of these marker genes in insect cells.
Therefore, MACs are candidates for expression of heterologous genes in
insect cells.

B. Construction of vectors for use in insect cells and fusion with 
mammalian cells 

1. Transform LMTK- cells with expression vector with:

a. B. mori β-actin promoter-Hyg^r selectable marker
gene for insect cells, and

b. SV40 or CMV promoters controlling a puromycin^r
selectable marker gene for mammalian cells.

2. Detect expression of the mammalian promoter in LMTK- cells
(puromycin^r LMTK- cells).

3. Use puromycin^r cells in fusion experiments with Bombyx and
Trichoplusia cells, select Hyg^r cells.

C. Insertion of the MACs into insect cells 

These experiments are designed to detect expression of a
detectable marker gene [such as the β-galactosidase gene expressed
under the control of a mammalian promoter, such as pSV40] located on
a MAC that has been introduced into an insect cell. Data indicate that
β-gal was expressed.

Insect cells are fused with mammalian cells containing mammalian
artificial chromosomes, e.g., the minichromosome [EC3/7C5] or the mini
and the megachromosome [such as GHB42, which is a cell line recloned
from G3D5] or a cell line that carries only the megachromosome [such as
H1D3 or a reclone therefrom]. Fusion is carried out as follows:

1. mammalian + insect cells (50/50%) in log phase growth are
mixed;

2. calcium/PEG cell fusion: (10 min - 0.5 h);

3. heterokaryons (+72 h) are selected.

The following selection conditions to select for insect cells that
contain a MAC can be used [+ = positive selection; - = negative
selection]:

1. growth at 28°C (+ insect cells, - mammalian cells);

2. Grace's insect cell medium [SIGMA] (- mammalian cells);

3. no exogenous CO2 (- mammalian cells); and/or

4. antibiotic selection (Hyg or G418) (+ transformed insect cells).

Immediately following the fusion protocol, many heterokaryons
[fusion events] are observed between the mammalian and each species
of insect cells [up to 90% heterokaryons]. After growth [2+ weeks] on
insect medium containing G418 and/or hygromycin at selection levels
used for selection of transformed mammalian cells, individual colonies are
detected growing on the fusion plates. By virtue of selection for the
antibiotic resistance conferred by the MAC and selection for insect cells,
these colonies should contain MACs.

The B. mori β-actin gene promoter has been shown to direct
expression of the β-galactosidase gene in B. mori cells and mammalian
cells (e.g., EC3/7C5 cells). The B. mori β-actin gene promoter is, thus,
particularly useful for inclusion in MACs generated in mammalian cells
that will subsequently be transferred into insect cells because the
presence of any marker gene linked to the promoter can be determined in
the mammalian and resulting insect cell lines.



EXAMPLE 12

Preparation of chromosome fragmentation vectors and other vectors for 
targeted integration of DNA into MACs 

Fragmentation of the megachromosome should ultimately result in
smaller stable chromosomes that contain about 15 Mb to 50 Mb that will
be easily manipulated for use as vectors. Vectors to effect such
fragmentation should also aid in determination and identification of the
elements required for preparation of an in vitro-produced artificial
chromosome.

Reduction in the size of the megachromosome can be achieved in
a number of different ways including: stress treatment, such as by
starvation, or cold or heat treatment; treatment with agents that
destabilize the genome or nick DNA, such as BrdU, coumarin, EMS and
others; treatment with ionizing radiation [see, e.g., Brown (1992) Curr.
Opin. Genet. Dev. 2:479-486]; and telomere-directed in vivo chromosome
fragmentation [see, e.g., Farr et al. (1995) EMBO J. 14:5444-5454].

A. Preparation of vectors for fragmentation of the artificial
chromosome and also for targeted integration of selected
gene products

1. Construction of pTEMPUD

Plasmid pTEMPUD [see Figure 5] is a mouse homologous
recombination "killer" vector for in vivo chromosome fragmentation, and
also for inducing large-scale amplification via site-specific integration.
With reference to Figure 5, the ~3,625-bp SalI-PstI fragment was
derived from the pBabe-puro retroviral vector [see Morgenstern et al.
(1990) Nucleic Acids Res. 18:3587-3596]. This fragment contains DNA
encoding ampicillin resistance, the pUC origin of replication, and the
puromycin N-acetyl transferase gene under control of the SV40 early
promoter. The URA3 gene portion comes from the pYAC5 cloning vector
[SIGMA]. URA3 was cut out of pYAC5 with SalI-XhoI digestion, cloned
into pNEB193 [New England Biolabs], which was then cut with EcoRI-SalI
and ligated to the SalI site of pBabepuro to produce pPU.

A 1293-bp fragment [see SEQ ID No. 1] encoding the mouse major
satellite was isolated as an EcoRI fragment from a DNA library produced
from mouse LMTK- fibroblast cells and inserted into the EcoRI site of pPU
to produce pMPU.

The TK promoter-driven diphtheria toxin gene [DT-A] was derived 
from pMC IDT- A [see, Maxwell et aL (1986) Cancer Res 4fi-Afifir>-zLfiftA] 
by Bgill-Xhol digestion and cloned into the pMCI neo poly A expression 
10 vector [STRATAGENE, La Jolla, CA] by replacing the neomycin- 

resistance gene coding sequence. The TK promoter, DT-A gene and poly 
A sequence were removed from this vector, cohesive ends were filled 
\«ith Klenow and the resulting fragment blunt end-ligated and ligated into 
the SnaBI (TACGTAJ of pMPU to produce pMPUD. 

The Hutel 2.5-kb fragment [see SEQ ID No. 3] was inserted at the
PstI site [see the 6100 PstI - 3625 PstI fragment on pTEMPUD] of
pMPUD to produce pTEMPUD. This fragment includes a human
telomere. It includes a unique BglII site [see nucleotides 1042-1047 of
SEQ ID No. 3], which will be used as a site for introduction of a synthetic
telomere that includes multiple repeats [80] of TTAGGG with BamHI and
BglII ends for insertion into the BglII site, which will then remain unique
because the BamHI overhang is compatible with the BglII overhang:
ligation of a BamHI end to a BglII end destroys the BglII site, so that only
a single BglII site will remain. Selection for the unique BglII site ensures
that the synthetic telomere will be inserted in the correct orientation. The
unique BglII site is the site at which the vector is linearized.
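
The compatibility argument above can be checked with a few string
operations. The following is a minimal sketch, assuming the standard
recognition sequences G^GATCC (BamHI) and A^GATCT (BglII); the helper
names are illustrative and are not part of the described method.

```python
# Recognition sequences (5'->3', top strand). Both enzymes cut after the
# first base, leaving a four-base 5' overhang.
BAMHI = "GGATCC"
BGLII = "AGATCT"


def overhang(site: str) -> str:
    """Return the 5' overhang produced by cutting after the first base."""
    return site[1:5]


def hybrid_junction(left_site: str, right_site: str) -> str:
    """Top-strand sequence formed when the left half of one cut site is
    ligated to the right half of the other (illustrative helper)."""
    return left_site[:1] + right_site[1:]


junctions = {
    "BamHI/BglII": hybrid_junction(BAMHI, BGLII),
    "BglII/BamHI": hybrid_junction(BGLII, BAMHI),
}

for name, seq in junctions.items():
    recut = seq in (BAMHI, BGLII)
    print(f"{name}: {seq} (recognized by either enzyme: {recut})")
```

Both overhangs are GATC, so the ends anneal, but neither hybrid junction
matches either recognition sequence, which is why only the BglII/BglII
junction remains cleavable after insertion.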
To generate a synthetic telomere made up of multiple repeats of
the sequence TTAGGG, attempts were made to clone or amplify ligation
products of 30-mer oligonucleotides containing repeats of the sequence.
Two 30-mer oligonucleotides, one containing four repeats of TTAGGG
bounded on each end of the complete run of repeats by half of a repeat
and the other containing five repeats of the complement AATCCC, were
annealed. The resulting double-stranded molecule with 3-bp protruding
ends, each representing half of a repeat, was expected to ligate with
itself to yield concatamers of n x 30 bp. However, this approach was
unsuccessful, likely due to formation of quadruplex DNA from the G-rich
strand. Similar difficulty has been encountered in attempts to generate
long repeats of the pentameric human satellite II and III units. Thus, it
appears that, in general, any oligomer sequence containing periodically
spaced consecutive series of guanine nucleotides is likely to form
undesired quadruplex structures that hinder construction of long
double-stranded DNAs containing the sequence.
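
The observation about periodically spaced guanine runs can be illustrated
with a simple sequence scan. This is only a sketch: the pattern below
(four runs of at least three G separated by loops of 1 to 7 bases) is a
commonly used heuristic assumed here for illustration. It is not taken
from the description above and does not predict whether a quadruplex
actually forms.

```python
import re

# Heuristic only: four runs of >=3 G separated by loops of 1-7 bases.
# This pattern is an assumption for illustration, not part of the text.
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")


def has_g4_motif(seq: str) -> bool:
    """True if the sequence contains a putative quadruplex-forming motif."""
    return G4_PATTERN.search(seq.upper()) is not None


g_rich = "TTAGGG" * 5   # telomeric G-rich strand
c_rich = "CCCTAA" * 5   # complementary C-rich strand

print("G-rich strand flagged:", has_g4_motif(g_rich))
print("C-rich strand flagged:", has_g4_motif(c_rich))
```

Under this heuristic the G-rich repeat is flagged while the C-rich strand,
which contains no guanines, is not.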

Therefore, in another attempt to construct a synthetic telomere for
insertion into the BglII site of pTEMPUD, the starting material was based
on the complementary C-rich repeat sequence (i.e., AATCCC), which
would not be susceptible to quadruplex structure formation. Two
plasmids, designated pTel280110 and pTel280111, were constructed as
follows to serve as the starting materials.

First, a long oligonucleotide containing 9 repeats of the sequence
AATCCC (i.e., the complement of telomere sequence TTAGGG) in
reverse order bounded on each end of the complete run of repeats by
half of a repeat (therefore, in essence, containing 10 repeats), and
recognition sites for PstI and PacI restriction enzymes was synthesized
using standard methods. The oligonucleotide sequence is as follows:

5'-AAACTGCAGGTTAATTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCGGGAT-3' (SEQ ID NO. 29)

A partially complementary short oligonucleotide of sequence

3'-TTGGGCCCTAGGCTTAAGG-5' (SEQ ID NO. [illegible])

was also synthesized. The oligonucleotides were gel-purified, annealed,
repaired with Klenow polymerase and digested with EcoRI and PstI. The
resulting EcoRI/PstI fragment was ligated with EcoRI/PstI-digested
pUC19. The resulting plasmid was used to transform E. coli DH5α
competent cells and plasmid DNA (pTel102) from one of the
transformants surviving selection on LB/ampicillin was digested with
PacI, rendered blunt-ended by Klenow and dNTPs and digested with
HindIII. The resulting 2.7-kb fragment was gel-purified.

Simultaneously, the same plasmid was amplified by the
polymerase chain reaction using extended and more distal 26-mer M13
sequencing primers. The amplification product was digested with SmaI
and HindIII, the double-stranded 84-bp fragment containing the 60-bp
telomeric repeat (plus 24 bp of linker sequence) was isolated on a 6%
native polyacrylamide gel, and ligated with the double-digested pTel102
to yield a 120-bp telomeric sequence. This plasmid was used to
transform DH5α cells. Plasmid DNA from two of the resulting
recombinants that survived selection on ampicillin (100 μg/ml) was
sequenced on an ABI DNA sequencer using the dye-termination method.
One of the plasmids, designated pTel29, contained a sequence of 20
repeats of the sequence TTAGGG (i.e., 19 successive repeats of
TTAGGG bounded on each end of the complete run of repeats with half
of a repeat). The other plasmid, designated pTel28, had undergone a
deletion of 2 bp (TA) at the junction where the two sequences, each
containing, in essence, 10 repeats of the TTAGGG sequence, had
been ligated to yield the plasmid. This resulted in a GGGTGGG motif at
the junction in pTel28. This mutation provides a useful tag in
telomere-directed chromosome fragmentation experiments. Therefore, the
pTel29 insert was amplified by PCR using pUC/M13 sequencing primers
based on sequence somewhat longer and farther from the polylinker than
usual as follows:

5'-GCCAGGGTTTTCCCAGTCACGACGT-3' (SEQ ID NO. 31)

or in some experiments

5'-GCTGCAAGGCGATTAAGTTGGGTAAC-3' (SEQ ID NO. 32)

as the M13 forward primer, and

5'-TATGTTGTGTGGAATTGTGAGCGGAT-3' (SEQ ID NO. [illegible])

as the M13 reverse primer.

The amplification product was digested with SmaI and HindIII. The
resulting 144-bp fragment was gel-purified on a 6% native
polyacrylamide gel and ligated with pTel28 that had been digested with
PacI, blunt-ended with Klenow and dNTP and then digested with HindIII
to remove linker. The ligation yielded a plasmid designated pTel2801
containing a telomeric sequence of 40 repeats of the sequence TTAGGG
in which one of the repeats (i.e., the 30th repeat) lacked two nucleotides
(TA), due to the deletion that had occurred in pTel28, to yield a repeat as
follows: TGGG.

In the next extension step, pTel2801 was digested with SmaI and
HindIII and the 264-bp insert fragment was gel-purified and ligated with
pTel2801 which had been digested with PacI, blunt-ended and digested
with HindIII. The resulting plasmid was transformed into DH5α cells and
plasmid DNA from 12 of the resulting transformants that survived
selection on ampicillin was examined by restriction enzyme analysis for
the presence of a 0.5-kb EcoRI/PstI insert fragment. Eleven of the
recombinants contained the expected 0.5-kb insert. The inserts of two
of the recombinants were sequenced and found to be as expected.
These plasmids were designated pTel280110 and pTel280111. These
plasmids, which are identical, both contain 80 repeats of the sequence
TTAGGG, in which two of the repeats (i.e., the 30th and 70th repeats)
lacked two nucleotides (TA), due to the deletion that had occurred in
pTel28, to yield a repeat as follows: TGGG. Thus, in each of the cloning
steps (except the first), the length of the synthetic telomere doubled; that
is, it was increasing in size exponentially. Its length was 60 x 2^n bp,
where n is the number of extension cloning steps undertaken.
Therefore, in principle (assuming E. coli, or any other microbial host, e.g.,
yeast, tolerates long tandem repetitive DNA), it is possible to assemble
any desirable size of safe telomeric repeats.
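
The doubling relationship stated above can be tabulated directly. The
sketch below assumes the nominal 60-bp starting block of 10 repeats and
ignores the 2-bp (TA) deletions carried by the tagged repeats; the function
name is illustrative.

```python
BASE_BP = 60        # nominal length of the first cloned block (10 repeats)
REPEAT_UNIT = 6     # length of one TTAGGG repeat


def telomere_length_bp(n_steps: int) -> int:
    """Nominal insert length after n extension (doubling) cloning steps."""
    return BASE_BP * 2 ** n_steps


for n in range(5):
    bp = telomere_length_bp(n)
    print(f"n={n}: {bp} bp, {bp // REPEAT_UNIT} repeats")
```

For n = 1 to 4 this gives 120, 240, 480 and 960 bp (20, 40, 80 and 160
repeats), in line with the insert sizes reported for the successive
plasmids.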

In a further extension step, pTel280110 was digested with PacI,
blunt-ended with Klenow polymerase [...] digested with HindIII. The
resulting 0.5-kb fragment was gel purified. Plasmid pTel280111 was
cleaved with [...] fragment was gel-purified and ligated to the 0.5-kb
fragment from pTel280110. The resulting plasmid was used to transform
DH5α cells. Plasmid DNA was purified from transformants surviving
ampicillin selection. Nine of the selected recombinants were examined by
restriction enzyme analysis for the presence of a 1.0-kb EcoRI/PstI
fragment. Four of the recombinants (designated pTlk2, pTlk6, pTlk7 and
pTlk8) were thus found to contain the desired 960 bp telomere DNA
insert sequence that included 160 repeats of the sequence TTAGGG in
which four of the repeats lacked two nucleotides (TA), due to the
deletion that had occurred in pTel28, to yield a repeat as follows: TGGG.
Partial DNA sequence analysis of the EcoRI/PstI fragment of two of these
plasmids (i.e., pTlk2 and pTlk6), in which approximately 300 bp from
both ends of the fragment were elucidated, confirmed that the sequence
was composed of successive repeats of the TTAGGG sequence.

In order to add PmeI and BglII sites to the synthetic telomere
sequence, pTlk2 was digested with PacI and PstI and the 3.7-kb
fragment (i.e., 2.7-kb pUC19 and 1.0-kb repeat sequence) was
gel-purified and ligated at the PstI cohesive end with the following
oligonucleotide: 5'-GGGTTTAAACAGATCTCTGCA-3' (SEQ ID NO. 34).
The ligation product was subsequently repaired with Klenow polymerase
and dNTP, ligated to itself and transformed into E. coli strain DH5α. A
total of 14 recombinants surviving selection on ampicillin were obtained.
Plasmid DNA from each recombinant could be cleaved with BglII,
indicating that this added unique restriction site had been retained by
each recombinant. Four of the 14 recombinants contained the complete
1-kb synthetic telomere insert, whereas the insert of the remaining 10
recombinants had undergone deletions of various lengths. The four
plasmids in which the 1-kb synthetic telomere sequence remained intact
were designated pTlkV2, pTlkV5, pTlkV8 and pTlkV12. Each of these
plasmids could also be digested with PmeI; in addition, the presence of
both the BglII and PmeI sites was verified by sequence analysis. Any of
these four plasmids can be digested with BamHI and BglII to release a
fragment containing the 1-kb synthetic telomere sequence, which is then
ligated with BglII-digested pTEMPUD.
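
As a quick check on the adapter oligonucleotide described above, the
following sketch locates the PmeI and BglII recognition sequences within
it. It assumes the standard recognition sequences GTTTAAAC (PmeI),
AGATCT (BglII) and CTGCAG (PstI, cut as CTGCA^G); the variable names
are illustrative.

```python
OLIGO = "GGGTTTAAACAGATCTCTGCA"  # SEQ ID NO. 34 as given in the text

SITES = {
    "PmeI": "GTTTAAAC",
    "BglII": "AGATCT",
}

# 0-based start position of each recognition sequence (-1 if absent)
positions = {name: OLIGO.find(site) for name, site in SITES.items()}

# The 3' end of the oligonucleotide matches a PstI site up to its cut position
ends_like_psti = OLIGO.endswith("CTGCA")

for name, pos in positions.items():
    print(f"{name} site starts at index {pos}")
print("3' end matches PstI-cut end:", ends_like_psti)
```

Both sites are present once, and the 3' terminus corresponds to a
PstI-cut end, consistent with ligation at the PstI cohesive end.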

2. Use of pTEMPUD for in vivo chromosome fragmentation

Linearization of pTEMPUD by BglII results in a linear molecule with
a human telomere at one end. Integration of this linear fragment into the
chromosome, such as the megachromosome in hybrid cells or any mouse
chromosome which contains repeats of the mouse major satellite
sequence, results in integration of the selectable marker
puromycin-resistance gene and cleavage of the plasmid by virtue of the
telomeric end. The DT gene prevents the entire linear fragment from
integrating by random events, since upon integration and expression it is
toxic. Thus, random integration will be toxic, and site-directed integration
into the targeted DNA will be selected. Such integration will produce
fragmented chromosomes. [...] survive, and the other fragment without
the centromere will be lost. Repeated in vivo fragmentations will
ultimately result in selection of the smallest functioning artificial
chromosome possible. Thus, this vector can be used to produce
minichromosomes from mouse chromosomes, or to fragment the
megachromosome. In principle, this vector can be used to target any
selected DNA sequence in any chromosome to achieve fragmentation.

3. Construction of pTERPUD

A fragmentation/targeting vector analogous to pTEMPUD for in vivo
chromosome fragmentation and for inducing large-scale
amplification via site-specific integration but which is based on mouse
rDNA sequence instead of mouse major satellite DNA has been
designated pTERPUD. In this vector, the mouse major satellite DNA
sequence of pTEMPUD has been replaced with a 4770-bp BamHI
fragment of megachromosome clone 161 which contains sequence
corresponding to nucleotides 10,232-15,000 in SEQ ID NO. 16.

4. pHASPUD and pTEMPhu3

Vectors that specifically target human chromosomes can be
constructed from pTEMPUD. These vectors can be used to fragment
specific human chromosomes, depending upon the selected satellite
sequence, to produce human minichromosomes, and also to isolate
human centromeres.

a. pHASPUD

To render pTEMPUD suitable for fragmenting human
chromosomes, the mouse major satellite sequence is replaced with
human satellite sequences. Unlike mouse chromosomes, each human
chromosome has a unique satellite sequence. For example, the mouse
major satellite has been replaced with a human hexameric α-satellite [or
alphoid satellite] DNA sequence. This sequence is an 813-bp fragment
[nucleotides 232-1044 of SEQ ID No. 2] from clone pS12, deposited in
the EMBL database under Accession number X60716, isolated from a
human colon carcinoma cell line Colo320 [deposited under Accession No.
ATCC CCL 220.1]. The 813-bp alphoid fragment can be obtained from
the pS12 clone by nucleic acid amplification using synthetic primers,
each of which contains an EcoRI site, as follows:

GGGGAATTCAT TGGGATGTTT CAGTTGA forward primer [SEQ ID No. 4]
CGAAAGTCCCC CCTAGGAGAT CTTAAGGA reverse primer [SEQ ID No. 5].

Digestion of the amplified product with EcoRI results in a fragment
with EcoRI ends that includes the human α-satellite sequence. This
sequence is inserted into pTEMPUD in place of the EcoRI fragment that
contains the mouse major satellite to yield pHASPUD.
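
The primer description above can be checked with a short scan for the
EcoRI recognition sequence (GAATTC). This is a sketch only: the primer
strings are copied from the text with spaces removed, and the helper
name is illustrative. The text does not state in which direction the
reverse primer is printed, so the code simply reports both reading
directions without asserting which is intended.

```python
ECORI = "GAATTC"

forward = "GGGGAATTCATTGGGATGTTTCAGTTGA"    # SEQ ID No. 4, spaces removed
reverse = "CGAAAGTCCCCCCTAGGAGATCTTAAGGA"   # SEQ ID No. 5, spaces removed


def site_orientations(seq: str, site: str) -> dict:
    """Report whether the site occurs reading left-to-right or right-to-left."""
    return {"as_written": site in seq, "reversed": site in seq[::-1]}


# Length of the amplified alphoid fragment, nucleotides 232-1044 inclusive
fragment_len = 1044 - 232 + 1

print("forward:", site_orientations(forward, ECORI))
print("reverse:", site_orientations(reverse, ECORI))
print("fragment length:", fragment_len, "bp")
```

The forward primer contains GAATTC as written, while the reverse primer
contains it only when read right to left, which may indicate that it is
printed in the 3' to 5' direction. The inclusive coordinate range gives
the stated 813 bp.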
Vector pHASPUD was linearized with BglII and used to transform
EJ30 (human fibroblast) cells by scrape loading. Twenty-seven
puromycin-resistant transformant strains were obtained.

b. pTEMPhu3

In pTEMPhu3, the mouse major satellite sequence is replaced by
the 3-kb human chromosome 3-specific α-satellite from D3Z1 [deposited
under ATCC Accession No. 85434; see, also Yrokov (1989) Cytogenet.
Cell Genet. 51:1114].

Each human chromosome contains unique chromosome-specific
sequence. Thus, pTEMPhu3, which is targeted to the
chromosome 3-specific α-satellite, can be introduced into human cells
under selective conditions, whereby large-scale amplification of the
chromosome 3 centromeric region and production of a de novo
chromosome ensues. Such induced large-scale amplification provides a
means for inducing de novo chromosome formation and also for in vivo
cloning of defined human chromosome fragments up to megabase size.

For example, the break-point in human chromosome 3 is on the
short arm near the centromere. This region is involved in renal cell
carcinoma formation. By targeting pTEMPhu3 to this region, the induced
large-scale amplification may contain this region, which can then be
cloned using the bacterial and yeast markers in the pTEMPhu3 vector.

The pTEMPhu3 cloning vector allows not only selection for
homologous recombinants, but also direct cloning of the integration site
in YACs. This vector can also be used to target human chromosome 3,
preferably with a deleted short arm, in a mouse-human
monochromosomal microcell hybrid line. Homologous recombinants can be
screened by nucleic acid amplification (PCR), and amplification can be
screened by DNA hybridization, Southern hybridization, and in situ
hybridization. The amplified region can be cloned into a YAC. This
vector and these methods also permit a functional analysis of cloned
chromosome regions by reintroducing the cloned amplified region into
mammalian cells.
B. Preparation of libraries in YAC vectors for cloning of centromeres
and identification of functional chromosomal units

Another method that may be used to obtain smaller-sized
functional mammalian artificial chromosome units and to clone
centromeric DNA involves screening of mammalian DNA YAC
vector-based libraries and functional analysis of potential positive clones
in a transgenic mouse model system. A mammalian DNA library is prepared
in a YAC vector, such as YRT2 [see Schedl et al. (1993) Nuc. Acids Res.
21:4783-4787], which contains the murine tyrosinase gene. The library
is screened for hybridization to mammalian telomere and centromere
sequence probes. Positive clones are isolated and microinjected into
pronuclei of fertilized oocytes of NMRI/Han mice following standard
techniques. The embryos are then transferred into NMRI/Han foster
mothers. Expression of the tyrosinase gene in transgenic offspring
confers an identifiable phenotype (pigmentation). The clones that give
rise to tyrosinase-expressing transgenic mice are thus confirmed as
containing functional mammalian artificial chromosome units.

Alternatively, fragments of SATACs may be introduced into the
YAC vectors and then introduced into pronuclei of fertilized oocytes of
NMRI/Han mice following standard techniques as above. The clones
that give rise to tyrosinase-expressing transgenic mice are thus confirmed
as containing functional mammalian artificial chromosome units,
particularly centromeres.

C. Incorporation of Heterologous Genes into Mammalian Artificial
Chromosomes through The Use of Homology Targeting Vectors

As described above, the use of mammalian artificial chromosomes
for expression of heterologous genes obviates certain negative effects
that may result from random integration of heterologous plasmid DNA
into the recipient cell genome. An essential feature of the mammalian
artificial chromosome that makes it a useful tool in avoiding the negative
[...] in recipient cells. Accordingly, methods of specific [...]
heterologous genes exclusively [...] artificial chromosome, without
[...] integration into the artificial chromosome through a recombination
event at [...] and the chromosome. The [...] selectable markers [...]
may also contain [...] vector into [...] integration of the vector into
the recipient cell genome. [...] homology targeting vectors, [...]
and λCF-7-DTA, are described below.

1. Construction of Vector λCF-7

[...] encoding [...] artificial chromosomes for use in gene [...]
applications. This vector, which also contains the [...] marker, as
[...] phosphate [...], was constructed in a series of steps as
follows.

a. Construction of pURA

Plasmid pURA was prepared by ligating a 2.6-kb SalI/XhoI
fragment from the yeast artificial chromosome vector pYAC5 [Sigma; see
also Burke et al. (1987) Science 236:806-812 for a description of YAC
vectors as well as GenBank Accession no. U01086 for the complete
sequence of pYAC5] containing the S. cerevisiae ura3 gene with a 3.3-kb
SalI/SmaI fragment of pHyg [see, e.g., U.S. Patent Nos. 4,997,764,
4,686,186 and 5,162,215, and the description above]. Prior to ligation
the XhoI end was treated with Klenow polymerase for blunt end ligation
to the SmaI end of the 3.3 kb fragment of pHyg. Thus, pURA contains
the S. cerevisiae ura3 gene, and the E. coli ColE1 origin of replication and
the ampicillin-resistance gene. The ura3 gene is included to provide a
means to recover the integrated construct from a mammalian cell as a
YAC clone.

b. Construction of pUP2

Plasmid pURA was digested with SalI and ligated to a 1.5-kb
SalI fragment of pCEPUR. Plasmid pCEPUR is produced by ligating the
1.1 kb SnaBI-NheI fragment of pBabe-puro [Morgenstern et al. (1990)
Nucl. Acids Res. 18:3587-3596; provided by Dr. L. Szekely
(Microbiology and Tumorbiology Center, Karolinska Institutet,
Stockholm); see, also Tonghua et al. (1995) Chin. Med. J. (Beijing, Engl.
Ed.) 108:653-659; Couto et al. (1994) Infect. Immun. 62:2375-2378;
Dunckley et al. (1992) FEBS Lett. 296:128-34; French et al. (1995) Anal.
Biochem. 228:354-355; Liu et al. (1995) Blood 85:1095-1103;
International PCT application Nos. WO 9520044; WO 9500178, and WO
9419456] to the NheI-NruI fragment of pCEP4 [Invitrogen].

The resulting plasmid, pUP2, contains all the elements of
pURA plus the puromycin-resistance gene linked to the SV40 promoter
and polyadenylation signal from pCEPUR.

The intermediate plasmid pUP-CFTR was generated in order
to combine the elements of pUP2 into a plasmid along with the CFTR
gene. First, a 4.5-kb SalI fragment of pCMV-CFTR that contains the
CFTR-encoding DNA [see, also, Riordan et al. (1989) Science
245:1066-1073, U.S. Patent No. 5,240,846, and Genbank Accession no.
M28668 for the sequence of the CFTR gene] containing the CFTR gene only
was ligated to XhoI-digested pCEP4 [Invitrogen and also described herein]
in order to insert the CFTR gene in the multiple cloning site of the Epstein
Barr virus-based (EBV) vector pCEP4 [Invitrogen, San Diego, CA; see also
Yates et al. (1985) Nature 313:812-815; see, also U.S. Patent No.
5,468,615] between the CMV promoter and SV40 polyadenylation
signal. The resulting plasmid was designated pCEP-CFTR. Plasmid
pCEP-CFTR was then digested with SalI and the 5.8-kb fragment
containing the CFTR gene flanked by the CMV promoter and SV40
polyadenylation signal was ligated to SalI-digested pUP2 to generate
pUP-CFTR. Thus, pUP-CFTR contains all elements of pUP2 plus the
CFTR gene linked to the CMV promoter and SV40 polyadenylation signal.

Construction of λCF-7

Plasmid pUP-CFTR was then linearized by partial digestion
with EcoRI and the 13 kb fragment containing the CFTR gene was ligated
with EcoRI-digested Charon 4Aλ [see Blattner et al. (1977) Science
196:161; Williams and Blattner (1979) J. Virol. 29:555 and Sambrook et
al. [...] Molecular Cloning, A Laboratory Manual, [...] Cold
Spring Harbor Laboratory Press, Volume 1, Section 2.18, for descriptions
of Charon 4Aλ]. The resulting vector, λCF8, contains the Charon 4Aλ
bacteriophage left arm, the CFTR gene linked to the CMV promoter and
SV40 polyadenylation signal, the ura3 gene, the puromycin-resistance
gene linked to the SV40 promoter and polyadenylation signal, the
thymidine kinase promoter [TK], the ColE1 origin of replication, the
ampicillin resistance gene and the Charon 4Aλ bacteriophage right arm.
The λCF8 construct was then digested with XhoI and the resulting 27.1
kb fragment was ligated to the 0.4 kb XhoI/EcoRI fragment of pJBP86
[described below], containing the SV40 poly A signal and the
EcoRI-digested Charon 4Aλ right arm. The resulting vector λCF-7 contains
the Charon 4Aλ left arm, the CFTR encoding DNA linked to the CMV
promoter and SV40 polyA signal, the ura3 gene, the puromycin resistance
gene linked to the SV40 promoter and polyA signal and the Charon 4Aλ
right arm. The λ DNA fragments provide sequences homologous to
nucleotides present in the exemplary artificial chromosomes.

The vector is then introduced into cells containing the artificial
chromosomes exemplified herein. Accordingly, when the linear λCF-7
vector is introduced into megachromosome-carrying fusion cell lines,
such as described herein, it will be specifically integrated into the
megachromosome through recombination between the homologous
bacteriophage λ sequences of the vector and the artificial chromosome.

2. Construction of Vector λCF-7-DTA

Vector λCF-7-DTA also contains all the elements contained in λCF-7,
but additionally contains a lethal selection marker, the diphtheria
toxin-A (DT-A) gene as well as the ampicillin-resistance gene and an
origin of replication. This vector was constructed in a series of steps as
follows.

a. Construction of pJBP86

Plasmid pJBP86 was used in the construction of λCF-7, above. A
1.5-kb SalI fragment of pCEPUR containing the puromycin-resistance
gene linked to the SV40 promoter and polyadenylation signal was ligated
to HindIII-digested pJB8 [see, e.g., Ish-Horowitz et al. (1981) Nucleic
Acids Res. 9:2989-2998; available from ATCC as Accession No. 37074;
commercially available from Amersham, Arlington Heights, IL]. Prior to
ligation the SalI ends of the 1.5 kb fragment of pCEPUR and the HindIII
linearized pJB8 ends were treated with Klenow polymerase. The
resulting vector pJBP86 contains the puromycin resistance gene linked to
the SV40 promoter and polyA signal, the 1.8 kb COS region of Charon
4Aλ, the ColE1 origin of replication and the ampicillin resistance gene.

b. Construction of pMEP-DTA

A 1.1-kb XhoI/SalI fragment of pMC1-DT-A [see, e.g., Maxwell et
al. (1986) Cancer Res. 46:4660-4666] containing the diphtheria toxin-A
gene was ligated to XhoI-digested pMEP4 [Invitrogen, San Diego, CA] to
generate pMEP-DTA. To produce pMC1-DT-A, the coding region of the
DTA gene was isolated as a 800 bp PstI/HindIII fragment from p2249-1
and inserted into pMC1neopolyA [pMC1 available from Stratagene] in
place of the neo gene and under the control of the TK promoter. The
resulting construct pMC1DT-A was digested with HindIII, the ends filled
by Klenow and SalI linkers were ligated to produce a 1061 bp TK-DTA
gene cassette with an XhoI end [5'] and a SalI end containing the 270 bp
TK promoter and the ~790 bp DT-A fragment. This fragment was
ligated into XhoI-digested pMEP4.

Plasmid pMEP-DTA thus contains the DT-A gene linked to the TK
promoter and SV40, ColE1 origin of replication and the
ampicillin-resistance gene.

c. Construction of pJB83-DTA9

Plasmid pJB8 was digested with HindIII and ClaI and ligated
with an oligonucleotide [see SEQ ID NOs. 7 and 8 for the sense and
antisense strands of the oligonucleotide, respectively] to generate pJB83.
The oligonucleotide that was ligated to ClaI/HindIII-digested pJB8
contained the recognition sites of SwaI, PacI and SrfI restriction
endonucleases. These sites will permit ready linearization of the
pλCF-7-DTA construct.

Next, a 1.4-kb XhoI/SalI fragment of pMEP-DTA, containing the
DT-A gene was ligated to SalI-digested pJB83 to generate pJB83-DTA9.

d. Construction of λCF-7-DTA

The 12-bp overhangs of λCF-7 were removed by Mung bean
nuclease and subsequent T4 polymerase treatments. The resulting 41.1-kb
linear λCF-7 vector was then ligated to pJB83-DTA9 which had been
digested with ClaI and treated with T4 polymerase. The resulting vector,
λCF-7-DTA, contains all the elements of λCF-7 as well as the DT-A gene
linked to the TK promoter and the SV40 polyadenylation signal, the
1.8 kb Charon 4Aλ COS region, the ampicillin-resistance gene [from
pJB83-DTA9] and the ColE1 origin of replication [from pJB83-DTA9].

D. Targeting vectors using luciferase markers: Plasmid pMCT-RUC

Plasmid pMCT-RUC [14 kbp] was constructed for site-specific
targeting of the Renilla luciferase [see, e.g., U.S. Patent Nos. 5,292,658
and 5,418,155 for a description of DNA encoding Renilla luciferase, and
plasmid pTZrLuc-1, which can provide the starting material for
construction of such vectors] gene to a mammalian artificial
chromosome. The relevant features of this plasmid are the Renilla
luciferase gene under transcriptional control of the human
cytomegalovirus immediate-early gene enhancer/promoter; the
hygromycin-resistance gene, a positive selectable marker, under the
transcriptional control of the thymidine kinase promoter. In particular,
this plasmid contains plasmid pAG60 [see, e.g., U.S. Patent Nos.
5,118,620, 5,021,344, 5,063,162 and 4,946,952; see, also
Colbere-Garapin et al. (1981) J. Mol. Biol. 150:1-14], which includes DNA
(i.e., the neomycin-resistance gene) homologous to the minichromosome, as
well as [...] the HSV-tk gene under control of the tk promoter as a
negative selectable marker for homologous recombination, and a unique
HpaI site for linearizing the plasmid.

This construct was introduced, via calcium phosphate transfection,
into EC3/7C5 cells [see, Lorenz et al. (1996) J. Biolum. Chemilum.
11:31-37]. The EC3/7C5 cells were maintained as a monolayer [see,
Gluzman (1981) Cell 23:175-183]. Cells at 50% confluency in 100 mm
Petri dishes were used for calcium phosphate transfection [see, Harper et
al. (1981) Chromosoma 83:431-439] using 10 μg of linearized
pMCT-RUC per plate. Colonies originating from single transfected cells
were isolated and maintained in F-12 medium containing hygromycin (300
μg/mL) and 10% fetal bovine serum. Cells were grown in 100 mm Petri
dishes prior to the Renilla luciferase assay.
20 ^e'^'^'^a 'uciferase assay was performed [see, e^,^ ^M^^ 

fit aL (1977) Biochemistry 16:85-91]. Hygromycln-reslst^nt cell lines 
obtained after transfection of EC3/7C5 cells with linearized plasnild 
pMCT-RUC ["B" cell lines] were grown to 100% confluency for measure- 
ments of light emission in yiyg and in yitro. Light emission was 
25 measured in yivo after about 30 generations as follows: growth medium 
was removed and replaced by 1 mL RPMI 1640 containing coelenterazine 
[1 mmol/L final concentration]. Light emission from cells was then 
visualized by placing the Petri dishes In a i6w light video imag analyzer 
[Hamamatsu Argus-100]. An image was formed after 5 min. of photon 
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accumulation using 100% sensitivity of the photon counting tube. For 
measuring light emission in vitro , cells were trypsinized and harvested 
from one Petri dish, pelleted, resuspended in ImL assay buffer [0.5 mol/L 
NaCI, 1 mmol/L EDTA, 0.1 mol/L potassium phosphate, pH 7,4] and 
5 sonicated on ice for 10 s. Lysates were than assayed in a Turner TD- 
20e luminometer for 10 s after rapid injection of 0.5 mL of 1 mmol/L 
coelenterazine, and the average yalue of light emission was recorded as 
LU [1 LU = 1.6 X 106 hu/s for this instrument]. 
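
The instrument calibration quoted above converts light units to photon
flux by a single multiplication. The sketch below uses only that
calibration factor; the example readings are hypothetical placeholders,
not data from these experiments, and the function name is illustrative.

```python
PHOTONS_PER_SEC_PER_LU = 1.6e6  # calibration quoted for this luminometer


def lu_to_photons_per_second(lu: float) -> float:
    """Convert a luminometer reading in LU to photons per second."""
    return lu * PHOTONS_PER_SEC_PER_LU


readings_lu = [0.5, 2.0, 10.0]  # hypothetical readings, not measured data

for lu in readings_lu:
    print(f"{lu} LU = {lu_to_photons_per_second(lu):.2e} photons/s")
```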

Independent cell lines of EC3/7C5 cells transfected with linearized
plasmid pMCT-RUC showed different levels of Renilla luciferase activity.
Similar differences in light emission were observed when measurements
were performed on lysates of the same cell lines. This variation in light
emission was probably due to a position effect resulting from the random
integration of plasmid pMCT-RUC into the mouse genome, since
enrichment for site targeting of the luciferase gene was not performed in
this experiment.

To obtain transfectant populations enriched in cells in which the
luciferase gene had integrated into the minichromosome, transfected cells
were grown in the presence of ganciclovir. This negative selection
medium selects against cells in which the added pMCT-RUC plasmid
integrated into the host EC3/7C5 genome. This selection thereby
enriches the surviving transfectant population with cells containing
pMCT-RUC in the minichromosome. The cells surviving this selection
were evaluated in luciferase assays which revealed a more uniform level
of luciferase expression. Additionally, the results of in situ hybridization
assays indicated that the Renilla luciferase gene was contained in the
minichromosome in these cells, which further indicates successful
targeting of pMCT-RUC into the minichromosome.

[...] pMCT-RUC which also contains [...]
DNA to provide an extended region of homology to the minichromosome
[see, other targeting vectors, below], was also used to transfect
EC3/7C5 cells. Site-directed targeting of the Renilla luciferase gene and
the hygromycin-resistance gene in pNEM-1 to the minichromosome in the
recipient EC3/7C5 cells was achieved. This was verified by DNA
amplification analysis and by in situ hybridization. Additionally, luciferase
gene expression was confirmed in luciferase assays of the transfectants.
E. Protein secretion targeting vectors

Isolation of heterologous proteins produced intracellularly in
mammalian cell expression systems requires cell disruption under
potentially harsh conditions and purification of the recombinant protein
from cellular contaminants. The process of protein isolation may be
greatly facilitated by secretion of the recombinantly produced protein into
the extracellular medium where there are fewer contaminants to remove
during purification. Therefore, secretion targeting vectors have been
constructed for use with the mammalian artificial chromosome system.

A useful model vector for demonstrating production and secretion
of heterologous protein in mammalian cells contains DNA encoding a
readily detectable reporter protein fused to an efficient secretion signal
that directs transport of the protein to the cell membrane and secretion
of the protein from the cell. Vectors pLNCX-ILRUC and pLNCX-ILRUCλ,
described below, are examples of such vectors. These vectors contain
DNA encoding an interleukin-2 (IL-2) signal peptide-Renilla reniformis
luciferase fusion protein. The IL-2 signal peptide [encoded by the
sequence set forth in SEQ ID No. 9] directs secretion of the luciferase
protein, to which it is linked, from mammalian cells. Upon secretion from
the host mammalian cell, the IL-2 signal peptide is cleaved from the
fusion protein to deliver mature, active, luciferase protein to the
extracellular medium. Successful production and secretion of this
heterologous protein can be readily detected by performing luciferase
assays which measure the light emitted upon exposure of the medium to
the bioluminescent luciferin substrate of the luciferase enzyme.
Thus, this feature will be useful when artificial chromosomes are used for
gene therapy. The presence of a functional artificial chromosome
carrying an IL-Ruc fusion with the accompanying therapeutic genes will
be readily monitored. Body fluids or tissues can be sampled and tested
for luciferase expression by adding luciferin and appropriate cofactors
and observing the bioluminescence.

1. Construction of Protein Secretion Vector pLNCX-ILRUC
Vector pLNCX-ILRUC contains a human IL-2 signal peptide-R. reniformis
fusion gene linked to the human cytomegalovirus (CMV) immediate early
promoter for constitutive expression of the gene in mammalian cells. The
construct was prepared as follows.

a. Preparation of the IL-2 signal sequence-encoding DNA
A 69-bp DNA fragment containing DNA encoding the human IL-2
signal peptide was obtained through nucleic acid amplification, using
appropriate primers for IL-2, of an HEK 293 cell line [see, e.g., U.S.
Patent No. 4,518,584 for an IL-2 encoding DNA; see, also SEQ ID No.
9; the IL-2 gene and corresponding amino acid sequence is also provided
in the Genbank Sequence Database as accession nos. K02056 and
J00264]. The signal peptide includes the first 20 amino acids shown in
the translations provided in both of these Genbank entries and in SEQ ID
NO. 9. The corresponding nucleotide sequence encoding the first 20
amino acids is also provided in these entries [see, e.g., nucleotides
293-352 of accession no. K02056 and nucleotides 478-537 of accession no.
J00264], as well as in SEQ ID NO. 9. The amplification primers included
an EcoRI site [GAATTC] for subcloning of the DNA fragment after
ligation into pGEMT [Promega]. The forward primer is set forth in SEQ ID
No. 11 and the sequence of the reverse primer is set forth in SEQ ID No.
12.

...TG forward [SEQ ID No. 11]
...GTGCACTGTTTGTGAC reverse [SEQ ID No. 12]
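
As a consistency check on the coordinates cited above, the following
sketch confirms that each cited nucleotide range spans exactly 20 codons,
i.e., the 20 amino acids of the signal peptide. It assumes that the
ranges are inclusive and 1-based and that the range for accession no.
K02056 reads 293-352; the function names are hypothetical.

```python
def span_length(first: int, last: int) -> int:
    """Number of nucleotides in an inclusive, 1-based coordinate range."""
    if last < first:
        raise ValueError("last must not be smaller than first")
    return last - first + 1


def codons_in_span(first: int, last: int) -> int:
    """Number of whole codons in the range; rejects partial codons."""
    length = span_length(first, last)
    if length % 3 != 0:
        raise ValueError("range does not contain a whole number of codons")
    return length // 3


# Ranges as cited in the text (K02056 range assumed to be 293-352).
SIGNAL_PEPTIDE_RANGES = {"K02056": (293, 352), "J00264": (478, 537)}

codon_counts = {
    accession: codons_in_span(first, last)
    for accession, (first, last) in SIGNAL_PEPTIDE_RANGES.items()
}
```

Both ranges are 60 nucleotides long, which agrees with a 20 amino acid
signal peptide.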
b. Preparation of the R. reniformis luciferase-encoding DNA

The initial source of the R. reniformis luciferase gene was
plasmid pLXSN-RUC. Vector pLXSN [see, e.g., U.S. Patent Nos.
5,324,655, 5,470,730, 5,468,634, 5,358,866 and Miller et al. (1989)
Biotechniques 7:980] is a retroviral vector for expressing
heterologous DNA under the transcriptional control of the retroviral LTR.
It also contains the neomycin-resistance gene operatively linked for
expression to the SV40 early region promoter. The R. reniformis
luciferase gene was obtained from plasmid pTZrLuc-1 [see, U.S.
Patent No. 5,292,653; see also the Genbank Sequence Database
accession no. M63501; and see also Lorenz et al. (1991) Proc. Natl.
Acad. Sci. U.S.A. 88:4438-4442] and is shown as SEQ ID NO. 10. The
[...] fragment of pTZrLuc-1 contains the coding region of
the Renilla luciferase-encoding DNA. Vector pLXSN was digested with
[...] and ligated with the luciferase gene contained on a [...] fragment
to yield pLXSN-RUC, which contains the luciferase gene operably linked
to the viral LTR and upstream of the SV40 promoter, which directs
expression of the neomycin-resistance gene.

c. Fusion of DNA Encoding the IL-2 Signal Peptide and
the R. reniformis Luciferase Gene

The pGEMT vector containing the IL-2 signal peptide-encoding
DNA described in 1.a. above was digested with EcoRI, and the resulting
fragment encoding the signal peptide was ligated to EcoRI-digested
pLXSN-RUC. The resulting plasmid, called pLXSN-ILRUC, contains the
IL-2 signal peptide-encoding DNA located immediately upstream of the R.
reniformis gene in pLXSN-RUC. Plasmid pLXSN-ILRUC was then used as
a template for nucleic acid amplification of the fusion gene in order to
add a SmaI site at the 3' end of the fusion gene. The amplification
product was subcloned into linearized [EcoRI/SmaI-digested] pGEMT
[Promega] to generate ILRUC-pGEMT.

d. Introduction of the Fusion Gene into a Vector
Containing Control Elements for Expression in
Mammalian Cells

Plasmid ILRUC-pGEMT was digested with KspI and SmaI to
release a fragment containing the IL-2 signal peptide-luciferase fusion
gene which was ligated to HpaI-digested pLNCX. Vector pLNCX [see,
e.g., U.S. Patent Nos. 5,324,655 and 5,457,182; see, also Miller and
Rosman (1989) Biotechniques 7:980-990] is a retroviral vector for
expressing heterologous DNA under the control of the CMV promoter; it
also contains the neomycin-resistance gene under the transcriptional
control of a viral promoter. The vector resulting from the ligation
reaction was designated pLNCX-ILRUC. Vector pLNCX-ILRUC contains
the IL-2 signal peptide-luciferase fusion gene located immediately
downstream of the CMV promoter and upstream of the viral 3' LTR and
polyadenylation signal in pLNCX. This arrangement provides for
expression of the fusion gene under the control of the CMV promoter.
Placement of the heterologous protein-encoding DNA [e.g., the luciferase
gene] in operative linkage with the IL-2 signal peptide-encoding DNA
provides for expression of the fusion in mammalian cells transfected with
the vector such that the heterologous protein is secreted from the host
cell into the extracellular medium.

2. Construction of Protein Secretion Targeting Vector pLNCX-ILRUCλ

Vector pLNCX-ILRUC may be modified so that it can be used to
introduce the fusion gene into a mammalian artificial chromosome in a
host cell. To facilitate specific incorporation of
the pLNCX-ILRUC expression vector into a mammalian artificial
chromosome, nucleic acid sequences that are homologous to nucleotides
present in the artificial chromosome are added to the vector to permit
site directed recombination.

The artificial chromosomes described herein contain λ phage
DNA. Therefore, protein secretion targeting vector pLNCX-ILRUCλ was
prepared by addition of λ phage DNA [from Charon 4A arms] to
the secretion vector pLNCX-ILRUC.

Mammalian cells [LMTK- from the ATCC] were transiently
transfected with vector pLNCX-ILRUC [...] by electroporation
[BIORAD, performed according to the manufacturer's instructions]. Stable
transfectants produced by growth in G418 for neo selection have also
been prepared.

Transfectants were grown and then analyzed for expression of
luciferase. To determine whether active luciferase was secreted from the
transfected cells, culture media were assayed for luciferase by addition
of coelenterazine [see, e.g., Matthews et al. (1977) Biochemistry
16:85-91].

The results of these assays establish that vector pLNCX-ILRUC is
capable of providing constitutive expression of heterologous DNA in
mammalian host cells. Furthermore, the results demonstrate that the
human IL-2 signal peptide is capable of directing secretion of proteins
fused to the C-terminus of the peptide. Additionally, these data
demonstrate that the R. reniformis luciferase protein is a highly effective
reporter molecule, which is stable in a mammalian cell environment, and
forms the basis of a sensitive, facile assay for gene expression.

b. Renilla reniformis luciferase appears to be secreted
from LMTK- cells.

(i) Renilla luciferase assay of cell pellets
The following cells were tested:
cells with no vector: LMTK- cells without vector as a negative
control;
cells transfected with pLNCX only;
cells transfected with RUC-pLNCX [Renilla luciferase gene in
pLNCX vector];
cells transfected with pLNCX-ILRUC [vector containing the IL-2
leader sequence + Renilla luciferase fusion gene in pLNCX vector].

Forty-eight hours after electroporation, the cells and culture
medium were collected. The cell pellet from 4 plates of cells was
resuspended in 1 ml assay buffer and was lysed by sonication. Two
hundred µl of the resuspended cell pellet was used for each assay for
luciferase activity [see, e.g., Matthews et al. (1977) Biochemistry
16:85-91]. The assay was repeated three times and the average
bioluminescence measurement was obtained.

The results showed that there was relatively low background
bioluminescence in the cells transformed with pLNCX or the negative
control cells; there was a low level observed in the cell pellet from cells
containing the vector with the IL-2 leader sequence-luciferase gene
fusion and more than 5000 RLU in the sample from cells containing
RUC-pLNCX.

(ii) Renilla luciferase assay of cell medium

Forty milliliters of medium from 4 plates of cells were harvested
and spun down. Two hundred microliters of medium was used for each
luciferase activity assay. The assay was repeated several times and the
average bioluminescence measurement was obtained. These results
showed that a relatively high level of bioluminescence was detected in
the cell medium from cells transformed with pLNCX-ILRUC; about 10-fold
lower levels [slightly above the background levels in medium from cells
with no vector or transfected with pLNCX only] were detected in the
medium from cells transfected with RUC-pLNCX.

(iii) Conclusions
The results of these experiments demonstrated that Renilla
luciferase appears to be secreted from LMTK- cells under the direction of
the IL-2 signal peptide. The medium from cells transfected with Renilla
luciferase-encoding DNA linked to the DNA encoding the IL-2 secretion
signal had substantially higher levels of Renilla luciferase activity than
controls or cells containing luciferase-encoding DNA without the signal
peptide-encoding DNA. Also, the differences between the controls and
cells containing luciferase-encoding DNA demonstrate that the luciferase
activity is specifically from luciferase, not from a non-specific reaction.
In addition, the results from the medium of RUC-pLNCX transfected cells,
which are similar to background, show that the luciferase activity in the
medium does not come from cell lysis, but from secreted luciferase.
c. Expression of R. reniformis Luciferase Using pLNCX-ILRUCλ

To express the IL-2 signal peptide-R. reniformis fusion gene from
a mammalian artificial chromosome, vector pLNCX-ILRUCλ is targeted
for site-specific integration into a mammalian artificial chromosome
through homologous recombination of the λ DNA sequences contained in
the chromosome and the vector. This is accomplished by introduction of
pLNCX-ILRUCλ into either a fusion cell line harboring mammalian artificial
chromosomes or mammalian host cells that contain mammalian artificial
chromosomes. If the vector is introduced into a fusion cell line harboring
the artificial chromosomes, for example through microinjection of the
vector or transfection of the fusion cell line with the vector, the cells are
then grown under selective conditions. The artificial chromosomes,
which have incorporated vector pLNCX-ILRUCλ, are isolated from the
surviving cells, using purification procedures as described above, and
then injected into the mammalian host cells.

Alternatively, the mammalian host cells may first be injected with
mammalian artificial chromosomes which have been isolated from a
fusion cell line. The host cells are then transfected with vector
pLNCX-ILRUCλ and grown.

The recombinant host cells are then assayed for luciferase
expression as described above.
F. Other targeting vectors

These vectors, which are based on vector pMCT-RUC, rely on
positive and negative selection to ensure insertion and selection for the
double recombinants. A single crossover results in incorporation of the
DT-A gene, which kills the cell; double crossover recombinations delete
the DT-A gene.
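
The positive/negative selection logic stated above can be sketched as
follows. This is a simplified model offered for illustration only: it
assumes that a single crossover inserts the whole vector (both the DT-A
and hygromycin-resistance genes), that a double crossover retains only
the hygromycin-resistance gene, and that untransfected cells carry
neither gene. The names used are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationOutcome:
    has_dta: bool  # DT-A (diphtheria toxin) negative marker present
    has_hyg: bool  # hygromycin-resistance positive marker present


def outcome_for(event: str) -> IntegrationOutcome:
    """Map a recombination event to the markers the cell ends up with."""
    if event == "none":
        return IntegrationOutcome(has_dta=False, has_hyg=False)
    if event == "single_crossover":
        # Whole vector incorporated, so the toxin gene comes along.
        return IntegrationOutcome(has_dta=True, has_hyg=True)
    if event == "double_crossover":
        # Toxin gene is deleted; positive marker is retained.
        return IntegrationOutcome(has_dta=False, has_hyg=True)
    raise ValueError(f"unknown event: {event}")


def survives(outcome: IntegrationOutcome, hygromycin_present: bool = True) -> bool:
    """A cell dies if it expresses DT-A or lacks resistance under selection."""
    if outcome.has_dta:
        return False
    if hygromycin_present and not outcome.has_hyg:
        return False
    return True
```

Under these assumptions only the double recombinants survive growth in
hygromycin, which is the enrichment the vectors are designed to provide.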



1. Plasmid pNEM-1 contains:

DT-A: Diphtheria toxin gene (negative selectable marker)
Hyg: Hygromycin gene (positive selectable marker)
ruc: Renilla luciferase gene (non-selectable marker)
1: LTR-MMTV promoter
2: TK promoter
3: CMV promoter
Homology region (plasmid pAG60)

2. Plasmids pNEM-2 and -3 are similar to pNEM-1 except for
different negative selectable markers:

pNEM-1: diphtheria toxin gene as "-" selectable marker
pNEM-2: hygromycin antisense gene as "-" selectable marker
pNEM-3: thymidine kinase HSV-1 gene as "-" selectable marker

3. Plasmid - λ DNA based homology:

pNEMλ-1: base vector
pNEMλ-2: base vector containing p53 gene
1: LTR MMTV promoter
2: SV40 promoter
3: CMV promoter
4: MTIIA promoter (metallothionein gene promoter)
Homology region (plasmid pAG60)
λ L.A. and λ R.A.: homology regions for λ left and right arms
(λ gt-WES).
EXAMPLE 13

Microinjection of mammalian cells with plasmid DNA

These procedures will be used to microinject MACs into eukaryotic
cells, including mammalian and insect cells.

The microinjection technique is based on the use of small glass
capillaries as a delivery system into cells and has been used for
introduction of DNA fragments into nuclei [see, e.g., Chalfie et al. (1994)
Science 263:802-804]. It allows the transfer of almost any type of
molecules, e.g., hormones, proteins, DNA and RNA, into either the
cytoplasm or nuclei of recipient cells. This technique has no cell type
restriction and is more efficient than other methods, including
Ca2+-mediated gene transfer and liposome-mediated gene transfer.
About 20-30% of the injected cells become successfully transformed.

Microinjection is performed under a phase-contrast microscope. A
glass microcapillary, prefilled with the DNA sample, is directed into a cell
to be injected with the aid of a micromanipulator. An appropriate sample
volume (1-10 pl) is transferred into the cell by gentle air pressure exerted
by a transjector connected to the capillary. Recipient cells are grown on
glass slides imprinted with numbered squares for convenient localization
of the injected cells.

a. Materials and equipment

Nunclon tissue culture dishes 35 x 10 mm, mouse cell line EC3/7C5,
Plasmid DNA pCH110 [Pharmacia], Purified Green Fluorescent Protein
(GFP) [GFPs from Aequorea and Renilla have been purified and also DNA
encoding GFPs has been cloned; see, e.g., Prasher et al. (1992) Gene
111:229-233; International PCT Application No. WO 95/07463, which is
based on U.S. application Serial No. 08/119,678 and U.S. application
Serial No. 08/192,274], ZEISS Axiovert 100 microscope, Eppendorf
transjector 5246, Eppendorf micromanipulator 5171, Eppendorf Cellocate
coverslips, Eppendorf microloaders, Eppendorf femtotips and other
standard equipment.

b. Protocol for injecting

(1) Fibroblast cells are grown in 35 mm
tissue culture dishes (37° C, 5% CO2) until the cell density reaches 80%
confluency. The dishes are removed from the incubator and medium is
added to about a 5 mm depth.

(2) The dish is placed onto the dish holder
and the cells observed with 10 x objective; the focus is desirably above
the cell surface.

(3) Plasmid or chromosomal DNA solution
[1 ng/µl] and GFP protein solution are further purified by centrifuging the
DNA sample at a force sufficient to remove any particulate debris
[typically about 10,000 rpm for 10 minutes in a microcentrifuge].

(4) Two µl of the DNA solution (1 ng/µl) is
loaded into a microcapillary with an Eppendorf microloader. During
loading, the loader is inserted to the tip end of the microcapillary. GFP
[...]

(5) The protecting sheath is removed from the
microcapillary and the microcapillary is fixed onto the capillary holder
connected with the micromanipulator.

(6) The capillary tip is lowered to the surface
of the medium and is focussed on the cells gradually until the tip of the
capillary reaches the surface of a cell. The capillary is lowered further so
that it is inserted into the cell. Various parameters, such as the level
of the capillary, the time and pressure, are determined for the particular
equipment. For example, using the fibroblast cell line C5 and the
above-noted equipment, the best conditions are: injection time 0.4 second,
pressure 80 psi. DNA can then be automatically injected into the nuclei
of the cells.

(7) After injection, the cells are returned to
the incubator, and incubated for about 18-24 hours.

(8) After incubation the number of
transformants can be determined by a suitable method, which depends
upon the selection marker. For example, if green fluorescent protein is
used, the assay can be performed using UV light source and fluorescent
filter set at 0-24 hours after injection. If β-gal-containing DNA, such as
DNA derived from pCH110, has been injected, then the transformants
can be assayed for β-gal.

c. Detection of β-galactosidase in cells injected
with plasmid DNA

The medium is removed from the culture plate and the cells are
fixed by addition of 5 ml of Fixation Solution I [1% glutaraldehyde; 0.1
M sodium phosphate buffer, pH 7.0; 1 mM MgCl2], and incubated for 15
minutes at 37° C. Fixation Solution I is replaced with 5 ml of X-gal
Solution II [0.2% X-gal, 10 mM sodium phosphate buffer (pH 7.0), 150
mM NaCl, 1 mM MgCl2, 3.3 mM K4Fe(CN)6H2O, 3.3 mM K3Fe(CN)6], and
the plates are incubated for 30-60 minutes at 37° C. The X-gal solution
is removed and 2 ml of 70% glycerol is added to each dish. Blue
stained cells are identified under a light microscope.

This method will be used to introduce a MAC, particularly the
MAC with the anti-HIV megachromosome, to produce a mouse model for
anti-HIV activity.
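
For orientation, the injection parameters given in this Example (a DNA
solution of about 1 ng/µl and an injected volume of 1-10 pl) can be
converted into an approximate number of plasmid molecules delivered per
cell. The sketch below is an estimate under stated assumptions: an
average mass of 650 g/mol per base pair of double-stranded DNA, and a
plasmid length that must be supplied by the user (the 7,000 bp figure in
the usage note is a hypothetical value, not taken from this
specification).

```python
AVOGADRO = 6.02214076e23       # molecules per mole
AVG_BP_MASS_G_PER_MOL = 650.0  # common approximation for one dsDNA base pair


def plasmid_copies(conc_ng_per_ul: float, volume_pl: float, plasmid_bp: int) -> float:
    """Approximate number of plasmid molecules in an injected volume."""
    if conc_ng_per_ul < 0 or volume_pl < 0 or plasmid_bp <= 0:
        raise ValueError("concentration and volume must be >= 0, length > 0")
    volume_ul = volume_pl * 1e-6                 # 1 pl = 1e-6 µl
    mass_g = conc_ng_per_ul * 1e-9 * volume_ul   # ng -> g
    moles = mass_g / (plasmid_bp * AVG_BP_MASS_G_PER_MOL)
    return moles * AVOGADRO
```

For example, plasmid_copies(1.0, 1.0, 7000) estimates the copies in 1 pl
of a 1 ng/µl solution of a hypothetical 7 kb plasmid; the estimate scales
linearly with the injected volume.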

EXAMPLE 14

Transgenic (non-human) animals

Transgenic (non-human) animals can be generated that express
heterologous genes which confer desired traits, e.g., disease resistance,
in the animals. A transgenic mouse is prepared to serve as a model of a
disease-resistant animal. Genes that encode vaccines or that encode
therapeutic molecules can be introduced into embryos or ES cells to
produce animals that express the gene product and thereby are resistant
to or less susceptible to a particular disorder.

The mammalian artificial megachromosome and others of the
artificial chromosomes, particularly the SATACs, can be used to generate
transgenic (non-human) animals, including mammals and birds, that
stably express genes conferring desired traits, such as genes conferring
resistance to pathogenic viruses. The artificial chromosomes can also be
used to produce transgenic (non-human) animals, such as pigs, that can
produce immunologically humanized organs for xenotransplantation.

For example, transgenic mice containing a transgene encoding an
anti-HIV ribozyme provide a useful model for the development of
transgenic (non-human) animals using these methods. The artificial
chromosomes can be used to produce transgenic (non-human) animals,
particularly cows, goats, mice, oxen, camels, pigs and sheep, that
produce the proteins of interest in their milk; and to produce transgenic
chickens and other egg-producing fowl that produce therapeutic proteins
or other proteins of interest in their eggs. For example, use of mammary
gland-specific promoters for expression of heterologous DNA in milk is
known [see, e.g., U.S. Patent No. 4,873,316]. In particular, a
milk-specific promoter or a promoter, preferably linked to a milk-specific
signal peptide, specifically activated in mammary tissue is operatively
linked to the DNA of interest, thereby providing expression of that DNA
sequence in milk.

1. Development of Control Transgenic Mice Expressing
Anti-HIV Ribozyme

Control transgenic mice are generated in order to compare stability
and amounts of transgene expression in mice developed using transgene
DNA carried on a vector (control mice) with expression in mice developed
using transgenes carried in an artificial megachromosome.

a. Development of Control Transgenic Mice Expressing
β-galactosidase

One set of control transgenic mice was generated by
microinjection of mouse embryos with the β-galactosidase gene alone.
The microinjection procedure used to introduce the plasmid DNA into the
mouse embryos is as described in Example 13, but modified for use with
embryos [see, e.g., Hogan et al. (1994) Manipulating the Mouse Embryo,
A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, see, especially pages 255-264 and Appendix 3]. Fertilized
mouse embryos [Strain CB6 obtained from Charles River Co.] were
injected with 1 ng of plasmid pCH110 (Pharmacia) which had been
linearized by digestion with BamHI. This plasmid contains the
β-galactosidase gene linked to the SV40 late promoter. The
β-galactosidase gene product provides a readily detectable marker for
successful transgene expression. Furthermore, these control mice
provide confirmation of the microinjection procedure used to introduce
the plasmid into the embryos. Additionally, because the
megachromosome that is transferred to the mouse embryos in the model
system (see below) also contains the β-galactosidase gene, the control
transgenic mice that have been generated by injection of pCH110 into
embryos serve as an analogous system for comparison of heterologous
gene expression from a plasmid versus from a gene carried on an artificial
chromosome.

After injection, the embryos are cultured in modified HTF medium
under 5% CO2 at 37°C for one day until they divide to form two cells.
The two-cell embryos are then implanted into surrogate mother female
mice [for procedures see, Manipulating the Mouse Embryo, A Laboratory
Manual (1994) Hogan et al., eds., Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY, pp. 127 et seq.].

b. Development of Control Transgenic Mice Expressing
Anti-HIV Ribozyme

One set of anti-HIV ribozyme gene-containing control transgenic
mice was generated by microinjection of mouse embryos with plasmid
pCEPUR-132 which contains three different genes: (1) DNA encoding an
anti-HIV ribozyme, (2) the puromycin-resistance gene and (3) the
hygromycin-resistance gene. Plasmid pCEPUR-132 was constructed by
ligating portions of plasmid pCEP-132 containing the anti-HIV ribozyme
gene (referred to as ribozyme D by Chang et al. [(1990) Clin. Biotech.
2:23-31]; see also U.S. Patent No. 5,144,019 to Rossi et al., particularly
Figure 4 of the patent) and the hygromycin-resistance gene with a
portion of plasmid pCEPUR containing the puromycin-resistance gene.

Plasmid pCEP-132 was constructed as follows. Vector pCEP4
(Invitrogen, San Diego, CA; see also Yates et al. (1985) Nature
313:812-815) was digested with XhoI which cleaves in the multiple cloning
site region of the vector. This ~10.4-kb vector contains the
hygromycin-resistance gene linked to the thymidine kinase gene promoter
and polyadenylation signal, as well as the ampicillin-resistance gene and
ColE1 origin of replication and EBNA-1 (Epstein-Barr virus nuclear
antigen) genes and OriP. The multiple cloning site is flanked by the
cytomegalovirus promoter and SV40 polyadenylation signal.

XhoI-digested pCEP4 was ligated with a fragment obtained by
digestion of plasmid 132 (see Example 4 for a description of this plasmid)
with XhoI and SalI. This XhoI/SalI fragment contains the anti-HIV
ribozyme gene linked at the 3' end to the SV40 polyadenylation signal.
The plasmid resulting from this ligation was designated pCEP-132. Thus,
in effect, pCEP-132 comprises pCEP4 with the anti-HIV ribozyme gene
and SV40 polyadenylation signal inserted in the multiple cloning site for
CMV promoter-driven expression of the anti-HIV ribozyme gene.

To generate pCEPUR-132, pCEP-132 was ligated with a fragment
of pCEPUR. pCEPUR was prepared by ligating a 7.7-kb fragment
generated upon NheI/NruI digestion of pCEP4 with a 1.1-kb NheI/SnaBI
fragment of pBabe [see Morgenstern and Land (1990) Nucleic Acids Res.
18:3587-3596 for a description of pBabe] that contains the
puromycin-resistance gene linked at the 5' end to the SV40 promoter.
Thus, pCEPUR is made up of the ampicillin-resistance and EBNA-1 genes,
as well as the ColE1 and OriP elements from pCEP4 and the
puromycin-resistance gene from pBabe. The puromycin-resistance gene in
pCEPUR is flanked by the SV40 promoter (from pBabe) at the 5' end and
the SV40 polyadenylation signal (from pCEP4) at the 3' end.

Plasmid pCEPUR was digested with XhoI and SalI and the
fragment containing the puromycin-resistance gene linked at the 5' end
to the SV40 promoter was ligated with XhoI-digested pCEP-132 to yield
the ~12.1-kb plasmid designated pCEPUR-132. Thus, pCEPUR-132, in
effect, comprises pCEP-132 with puromycin-resistance gene and SV40
promoter inserted at the XhoI site. The main elements of pCEPUR-132
are the hygromycin-resistance gene linked to the thymidine kinase
promoter and polyadenylation signal, the anti-HIV ribozyme gene linked
to the CMV promoter and SV40 polyadenylation signal, and the
puromycin-resistance gene linked to the SV40 promoter and
polyadenylation signal. The plasmid also contains the
ampicillin-resistance and EBNA-1 genes and the ColE1 origin of
replication and OriP.
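
The approximate sizes stated above permit a simple bookkeeping check of
the constructions. The sketch below sums the two fragments ligated to
form pCEPUR and computes how much DNA was added to pCEP4 to reach the
stated size of pCEPUR-132. It assumes that the stated sizes (which are
approximate) can be treated as exact and that insertion at the XhoI site
removes no vector sequence; the combined insert size is therefore an
inference and is not a figure given in this specification.

```python
# Approximate sizes stated in the text, expressed in base pairs.
PCEP4_BP = 10_400
PCEP4_NHEI_NRUI_FRAGMENT_BP = 7_700
PBABE_NHEI_SNABI_FRAGMENT_BP = 1_100
PCEPUR_132_BP = 12_100


def ligation_product_size(*fragment_sizes_bp: int) -> int:
    """Size of a circular product formed by joining the given fragments."""
    if any(size <= 0 for size in fragment_sizes_bp):
        raise ValueError("fragment sizes must be positive")
    return sum(fragment_sizes_bp)


# pCEPUR = 7.7-kb pCEP4 fragment + 1.1-kb pBabe fragment.
pcepur_bp = ligation_product_size(
    PCEP4_NHEI_NRUI_FRAGMENT_BP, PBABE_NHEI_SNABI_FRAGMENT_BP
)

# DNA added to pCEP4 by the ribozyme and puromycin-resistance inserts together.
combined_inserts_bp = PCEPUR_132_BP - PCEP4_BP
```

On these assumptions pCEPUR is about 8.8 kb, and the two inserts
together account for roughly 1.7 kb of pCEPUR-132.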

Zygotes were prepared from (C57BL/6JxCBA/J) F1 female mice
[see, e.g., Manipulating the Mouse Embryo, A Laboratory Manual (1994)
Hogan et al., eds., Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, p. 429], which had been previously mated with a
(C57BL/6JxCBA/J) F1 male. The male pronuclei of these F2 zygotes
were injected [see, Manipulating the Mouse Embryo, A Laboratory
Manual (1994) Hogan et al., eds., Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY] with pCEPUR-132 (~3 µg/ml), which had been
linearized by digestion with NruI. The injected eggs were then implanted
in surrogate mother female mice for development into transgenic
offspring.

These primary carrier offspring were analyzed (as described below)
for the presence of the transgene in DNA isolated from tail cells. Seven
carrier mice that contained transgenes in their tail cells (but that may not
carry the transgene in all their cells, i.e., they may be chimeric) were
allowed to mate to produce non-chimeric or germ-line heterozygotes.
The heterozygotes were, in turn, crossed to generate homozygote
transgenic offspring.
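
The expected outcome of crossing the heterozygotes can be sketched with
a simple Punnett-square calculation. The sketch assumes a single
transgene insertion locus that segregates in Mendelian fashion, with
"T" denoting a chromosome carrying the transgene and "w" one lacking it;
these symbols and the function name are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a: str, parent_b: str) -> dict:
    """Return offspring genotype -> expected fraction for two diploid parents.

    Each parent is a two-character genotype such as "Tw".
    """
    if len(parent_a) != 2 or len(parent_b) != 2:
        raise ValueError("each genotype must have exactly two alleles")
    # One allele from each parent; sort so that "wT" and "Tw" are pooled.
    counts = Counter(
        "".join(sorted(pair)) for pair in product(parent_a, parent_b)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


heterozygote_cross = cross("Tw", "Tw")
```

Under these assumptions one quarter of the offspring of a heterozygote
cross are expected to be homozygous for the transgene, one half
heterozygous, and one quarter non-transgenic.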



2. Development of Model Transgenic Mice Using
Mammalian Artificial Chromosomes

Fertilized mouse embryos are microinjected (as described above)
with megachromosomes (1-10 pL containing 0-1 chromosomes/pL)
isolated from fusion cell line G3D5 or H1D3 (described above). The
megachromosomes are isolated as described herein. Megachromosomes
isolated from either cell line carry the anti-HIV ribozyme (ribozyme D)
gene as well as the hygromycin-resistance and β-galactosidase genes.

The injected embryos are then developed into transgenic mice as
described above.
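
The injection parameters stated above (1-10 pL at 0-1 chromosomes/pL)
imply that an embryo receives on average between 0 and 10
megachromosomes. The sketch below computes that expectation and, as an
added assumption that is not stated in this specification, models the
number delivered as Poisson-distributed in order to estimate the chance
that at least one megachromosome is delivered. The function names are
hypothetical.

```python
import math


def expected_chromosomes(conc_per_pl: float, volume_pl: float) -> float:
    """Mean number of chromosomes delivered in one injection."""
    if conc_per_pl < 0 or volume_pl < 0:
        raise ValueError("concentration and volume must be non-negative")
    return conc_per_pl * volume_pl


def prob_at_least_one(conc_per_pl: float, volume_pl: float) -> float:
    """P(at least one chromosome delivered), assuming a Poisson count."""
    mean = expected_chromosomes(conc_per_pl, volume_pl)
    return 1.0 - math.exp(-mean)
```

At the upper end of both ranges the mean is 10 chromosomes per
injection; at lower concentrations a substantial fraction of injections
would be expected to deliver none.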

Alternatively, the megachromosome-containing cell line G3D5* or
H1D3* is fused with mouse embryonic stem cells [see, e.g., U.S. Patent
No. 5,453,357, commercially available; see Manipulating the Mouse
Embryo, A Laboratory Manual (1994) Hogan et al., eds., Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, NY, pages 253-289]
following standard procedures [see also, e.g., Guide to Techniques in
Mouse Development in Methods in Enzymology Vol. 225, Wassarman and
De Pamphilis, eds. (1993), pages 803-932]. (It is also possible to deliver
isolated megachromosomes into embryonic stem cells using the Microcell
procedure [such as that described above].) The stem cells are cultured in
the presence of fibroblasts [e.g., STO fibroblasts that are resistant to
hygromycin and puromycin]. Cells of the resultant fusion cell line, which
contains megachromosomes carrying the transgenes [i.e., anti-HIV
ribozyme, hygromycin-resistance and β-galactosidase genes], are then
transplanted into mouse blastocysts, which are in turn implanted into a
surrogate mother female mouse where development into a transgenic
mouse will occur.

Mice generated by this method are chimeric; the transgenes will be
expressed in only certain areas of the mouse, e.g., the head, and thus
may not be expressed in all cells.

3. Analysis of Transgenic Mice for Transgene Expression

Beginning when the transgenic mice, generated as described
above, are three-to-four weeks old, they can be analyzed for stable
expression of the transgenes that were transferred into the embryos [or
fertilized eggs] from which they develop. The transgenic mice may be
analyzed in several ways as follows.

a. Analysis of Ceils Obtained from the Transgenic ' ''^f 

Mice . 

Cell samples [e.g., spleen, liver and kidney cells, lymphocytes, tail
cells] are obtained from the transgenic mice. Any cells may be tested for
transgene expression. If, however, the mice are chimeras generated by
microinjection of fertilized eggs or by fusion of embryonic stem cells with
megachromosome-containing cells, only cells from areas of the mouse
that carry the transgene are expected to express the transgene. If the
cells survive growth on hygromycin [or hygromycin and puromycin or
neomycin, if the cells are obtained from mice generated by transfer of
both antibiotic-resistance genes], this is one indication that they are
stably expressing the transgenes. RNA isolated from the cells according
to standard methods may also be analyzed by northern blot procedures
to determine if the cells express transcripts that hybridize to nucleic acid
probes based on the antibiotic-resistance genes. Additionally, cells
obtained from the transgenic mice may also be analyzed for
β-galactosidase expression using standard assays for this marker enzyme

[for example, by direct staining of the product of a reaction involving
β-galactosidase ... 5:3133-3142, or by measurement of β-galactosidase
activity, see, e.g., Miller (1972) in Experiments in Molecular Genetics,
Cold Spring Harbor Press]. Analysis of β-galactosidase expression is
used in particular to evaluate transgene expression in cells obtained from
control transgenic mice in which the only transgene transferred into the
embryo was the β-galactosidase gene.

Stable expression of the anti-HIV ribozyme gene in cells obtained
from the transgenic mice may be evaluated in several ways. First, DNA
isolated from the cells according to standard procedures may be
subjected to nucleic acid amplification using primers corresponding to the
ribozyme gene sequence. If the gene is contained within the cells, an
amplified product of pre-determined size is detected upon hybridization of
the reaction mixture to a nucleic acid probe based on the ribozyme gene
sequence. Furthermore, DNA isolated from the cells may be analyzed
using Southern blot methods for hybridization to such a nucleic acid
probe. Second, RNA isolated from the cells may be subjected to
northern blot hybridization to determine if the cells express RNA that
hybridizes to nucleic acid probes based on the ribozyme gene. Third, the
cells may be analyzed for the presence of anti-HIV ribozyme activity as
described, for example, in Chang et al. (1990) Clin. Biotech. 2:23-31. In
this analysis, RNA isolated from the cells is mixed with radioactively
labeled HIV gag target RNA, which can be obtained by in vitro
transcription of a gag gene template, under reaction conditions favorable to
in vitro cleavage of the gag target, such as those described in Chang et
al. (1990) Clin. Biotech. 2:23-31. After the reaction has been stopped,
the mixture is analyzed by gel electrophoresis to determine if cleavage
products smaller in size than the whole template are detected; the presence
of such cleavage fragments is indicative of the presence of stably
expressed ribozyme.
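
The read-out of the cleavage assay is a size comparison: a single cut in the labeled gag target yields two fragments whose lengths add up to the length of the intact transcript. The following Python sketch illustrates that bookkeeping only. The transcript length, cut position, band sizes and tolerance are hypothetical values chosen for illustration; they are not taken from Chang et al. or from this specification.

```python
def cleavage_fragments(target_len, cut_after):
    """Return the (5', 3') fragment lengths produced by one cleavage event."""
    if not 0 < cut_after < target_len:
        raise ValueError("cut site must fall inside the target")
    return cut_after, target_len - cut_after


def indicates_cleavage(observed_bands, target_len, cut_after, tol=5):
    """True if both expected fragments appear among the observed band sizes.

    tol is an assumed sizing tolerance (in nucleotides) for reading a gel.
    """
    expected = cleavage_fragments(target_len, cut_after)
    return all(
        any(abs(band - frag) <= tol for band in observed_bands)
        for frag in expected
    )


# Hypothetical example: a 600-nt target cut after nucleotide 180.
TARGET_LEN, CUT_AFTER = 600, 180
print(cleavage_fragments(TARGET_LEN, CUT_AFTER))
print(indicates_cleavage([600, 421, 178], TARGET_LEN, CUT_AFTER))
print(indicates_cleavage([600], TARGET_LEN, CUT_AFTER))
```

A lane showing only the full-length band would be scored negative by this check, which mirrors the criterion stated above that fragments smaller than the whole template must be detected.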

b. Analysis of Whole Transgenic Mice 
Whole transgenic mice that have been generated by transfer of the
anti-HIV ribozyme gene [as well as selection and marker genes] into
embryos or fertilized eggs can additionally be analyzed for transgene
expression by challenging the mice with HIV infection. It is possible
for mice to be infected with HIV upon intraperitoneal injection with
high-producing HIV-infected U937 cells [see, e.g., Locardi et al. (1992)
J. Virol. 66:1649-1654]. Successful infection may be confirmed by
analysis of DNA isolated from cells, such as peripheral blood
mononuclear cells, obtained from transgenic mice that have been injected
with HIV-infected human cells. The DNA of infected transgenic mouse
cells will contain HIV-specific gag and env sequences, as demonstrated
by, for example, nucleic acid amplification using HIV-specific primers. If
the cells also stably express the anti-HIV ribozyme, then analysis of RNA
extracts of the cells should reveal the smaller gag fragments arising by
cleavage of the gag transcript by the ribozyme.

Additionally, the transgenic mice carrying the anti-HIV ribozyme
gene can be crossed with transgenic mice expressing human CD4 (i.e.,
the cellular receptor for HIV) [see Gillespie et al. (1993) Mol. Cell. Biol.
13:2952-2958; Hanna et al. (1994) Mol. Cell. Biol. 14:1084-1094; and
Yeung et al. (1994) J. Exp. Med. 180:1911-1920, for a description of
transgenic mice expressing human CD4]. The offspring of these crossed
transgenic mice expressing both the CD4 and anti-HIV ribozyme
transgenes should be more resistant to infection [as a result of a
reduction in the levels of active HIV in the cells] than mice expressing
CD4 alone [without expressing anti-HIV ribozyme].

chromosomes

Transgenic chickens have many applications in the improvement of
domestic poultry, an agricultural species of
commercial significance, such as disease resistance genes and genes
encoding therapeutic proteins. It appears that efforts in the area of
chicken transgenesis have been hampered due to difficulty in achieving
stable expression of transgenes in chicken cells using conventional
methods of gene transfer via random introduction into recipient cells.
Artificial chromosomes are, therefore, particularly useful in the
development of transgenic chickens because they provide for stable
maintenance of transgenes in host cells.

a. Preparation of artificial chromosomes for introduction
of transgenes into recipient chicken cells

(i) Mammalian artificial chromosomes

Mammalian artificial chromosomes, such as the SATACs and
minichromosomes described herein, can be modified to incorporate
detectable reporter genes and/or transgenes of interest for use in
developing transgenic chickens. Alternatively, chicken-specific artificial
chromosomes can be constructed using the methods herein. In
particular, chicken artificial chromosomes [CACs] can be prepared using
the methods herein for preparing MACs; or, as described above, the
chicken libraries can be introduced into MACs provided herein and the
resulting MACs introduced into chicken cells and those that are
functional in chicken cells selected.

As described in Examples 4 and 7, and elsewhere herein, artificial
chromosome-containing mouse LMTK⁻-derived cell lines, or
minichromosome-containing cell lines, as well as hybrids thereof, can be
transfected with selected ...
integrated the foreign DNA for functional expression of heterologous
genes contained within the DNA.

To generate MACs or CACs containing transgenes to be expressed
in chicken cells, the MAC-containing cell lines may be transfected with
DNA that includes λ DNA and transgenes of interest operably linked to a
promoter that is capable of driving expression of genes in chicken cells.
Alternatively, the minichromosomes or MACs [or CACs], produced as
described above, can be isolated and introduced into cells, followed by
targeted integration of selected DNA. Vectors for targeted integration
are provided herein or can be constructed as described herein.

Promoters of interest include constitutive, inducible and tissue (or
cell)-specific promoters known to those of skill in the art to promote
expression of genes in chicken cells. For example, expression of the lacZ
gene in chicken blastodermal cells and primary chicken fibroblasts has
been demonstrated using a mouse heat-shock protein 68 (hsp 68)
promoter [phspPTlacZpA; see Brazolot et al. (1991) Mol. Reprod. Devel.
30:304-312], a Zn²⁺-inducible chicken metallothionein (cMt) promoter
[pCBcMtlacZ; see Brazolot et al. (1991) Mol. Reprod. Devel. 30:304-312],
the constitutive Rous sarcoma virus and chicken β-actin promoters
in tandem [pmiwZ; see Brazolot et al. (1991) Mol. Reprod. Devel.
30:304-312] and the constitutive cytomegalovirus (CMV) promoter. Of
particular interest herein are egg-specific promoters that are derived from
genes, such as ovalbumin and lysozyme, that are expressed in eggs.
The choice of promoter will depend on a variety of factors,
including, for example, whether the transgene product is to be expressed
throughout the transgenic chicken or restricted to certain locations, such
as the egg. Cell-specific promoters functional in chickens include the
steroid-responsive promoter of the egg ovalbumin protein-encoding gene
[see Gaub et al. (1987) EMBO 6:2313-2320; Tora et al. (1988) EMBO

(ii) Chicken artificial chromosomes

Additionally, chicken artificial chromosomes may be generated
using methods described herein. For example, chicken cells, such as
primary chicken fibroblasts [see Brazolot et al. (1991) Mol. Reprod.
Devel. 30:304-312], may be transfected with DNA that encodes a
selectable marker [such as a protein that confers resistance to
antibiotics] and that includes DNA (such as chicken satellite DNA) that
targets the introduced DNA to the pericentric region of the endogenous
chicken chromosomes. Transfectants that survive growth on selection
medium are then analyzed, using methods described herein, for the
presence of artificial chromosomes, including minichromosomes, and
particularly SATACs. An artificial chromosome-containing transfectant
cell line may then be transfected with DNA encoding the transgene of
interest [fused to an appropriate promoter] along with DNA that targets
the foreign DNA to the chicken artificial chromosome.

b. Introduction of artificial chromosomes carrying
transgenes of interest into recipient chicken cells

Cell lines containing artificial chromosomes that harbor
transgene(s) of interest (i.e., donor cells) may be fused with recipient
chicken cells in order to transfer the chromosomes into the recipient
cells. Alternatively, the artificial chromosomes may be isolated from the
donor cells, for example, using methods described herein [see, e.g.,
Example 10], and directly introduced into recipient cells.

Exemplary chicken recipient cell lines include, but are not limited
to, stage X blastoderm cells [see, e.g., Brazolot et al. (1991) Mol.
Reprod. Devel. 30:304-312; Etches et al. (1993) Poultry Sci. 72:882-889;
Petitte et al. (1990) Development 108:185-189] and chick zygotes [see,
e.g., Love et al. (1994) Biotechnology 12:60-63].

For example, microcell fusion is one method for introduction of
artificial chromosomes into avian cells [see, e.g., Dieken et al. (1996)
Nature Genet. 12:174-182 for methods of fusing microcells with DT40
chicken pre-B cells]. In this method, microcells are prepared [for
example, using procedures described in Example 1.A.5] from the artificial
chromosome-containing cell lines and fused with chicken recipient cells.

Isolated artificial chromosomes may be directly introduced into
chicken recipient cell lines through, for example, lipid-mediated carrier
systems, such as lipofection procedures [see, e.g., Brazolot et al. (1991)
Mol. Reprod. Devel. 30:304-312] or direct microinjection. Microinjection is
generally preferred for introduction of the artificial chromosomes into
chicken zygotes [see, e.g., Love et al. (1994) Biotechnology 12:60-63].

c. Development of transgenic chickens

Transgenic chickens may be developed by injecting recipient stage
X blastoderm cells (which have received the artificial chromosomes) into
embryos at a similar stage of development [see, e.g., Etches et al.
(1993) Poultry Sci. 72:882-889; Petitte et al. (1990) Development
108:185-189; and Carsience et al. (1993) Development 117:669-675].
The recipient chicken embryos within the shell are candled and allowed
to hatch to yield a germline chimeric chicken that will express the
transgene(s) in some of its cells.

Alternatively, the artificial chromosomes may be introduced into
chick zygotes, for example through direct microinjection [see, e.g., Love
et al. (1994) Biotechnology 12:60-63], and thereby are incorporated
into at least a portion of the cells in the chicken. Inclusion of a
tissue-specific promoter, such as an egg-specific promoter, will ensure
appropriate expression of operatively-linked heterologous DNA.

The DNA of interest may also be introduced into a
minichromosome, by methods provided herein. The minichromosome
may either be one provided herein, or one generated in chicken cells
using the methods herein. The heterologous DNA will be introduced
using a targeting vector, such as those provided herein, or constructed
as provided herein.

Since modifications will be apparent to those of skill in this art, it
is intended that this invention be limited only by the scope of the
appended claims.

SEQUENCE LISTING



(1) GENERAL INFORMATION 
(i) APPLICANT: 

(A) NAME: The Biological Research Center of the Hungarian 
Academy of Sciences 

(B) STREET: Post Office Box 521

(C) CITY: H-6701 Szeged

(D) STATE:

(E) COUNTRY: Hungary

(F) POSTAL CODE (ZIP): 

(i) APPLICANT: 

(A) NAME: Loma Linda University 

(B) STREET:

(C) CITY: Loma Linda 

(D) STATE: California 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP) : 92350 

(i) APPLICANT: 

(A) NAME: AMERICAN GENE THERAPY, INC. 

(B) STREET: 5022 154 Street 

(C) CITY: Edmonton, T6H5PE 

(D) STATE: 

(E) COUNTRY: CANADA

(F) POSTAL CODE (ZIP): 

(i) INVENTOR/APPLICANT: 

(A) NAME: Gyula Hadlaczky 

(B) STREET: H.6723, Szeged 

(C) CITY: Szamos U.l.A. IX 36 

(D) STATE: 

(D) COUNTRY: Hungary 

(E) POSTAL CODE (ZIP) : 

(i) INVENTOR/APPLICANT:

(A) NAME: Aladar Szalay

(B) STREET: 7327 Fairwood

(C) CITY: Highland 

(D) STATE: California 

(D) COUNTRY: USA 

(E) POSTAL CODE (ZIP) : 92346 



(ii) TITLE OF THE INVENTION: ARTIFICIAL CHROMOSOMES, USES THEREOF 
AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES 

(iii) NUMBER OF SEQUENCES: 34 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Brown, Martin, Haller & McClain 

(B) STREET: 1660 Union Street 

(C) CITY: San Diego 

(D) STATE: CA 

(E) COUNTRY: USA 

(F) ZIP: 92101-2926

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette

(B) COMPUTER: IBM Compatible

(C) OPERATING SYSTEM: DOS

(D) SOFTWARE: FastSEQ Version 1.5

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE: 10-04-1997

(C) CLASSIFICATION:

(vi) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: 08/695,191 

(B) FILING DATE: 07-AUG-1996 

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/682,080 

(B) FILING DATE: 15-JUL-1996 

(C) CLASSIFICATION: 

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/629,822 

(B) FILING DATE: 10-APR-1996

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Seidman, Stephanie L 

(B) REGISTRATION NUMBER: 33,779 

(C) REFERENCE/DOCKET NUMBER: 6869-402PC

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 619-238-0999 

(B) TELEFAX: 619-238-0062 

(C) TELEX: 

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1293 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GAATTCATCA TTTTTCANGT CCTCAAGTGG ATGTTTCTCA TTTNCCATGA TTTTAAGTTT 60 

TCTCGCCATA TTCCTGGTCC TACAGTGTGC ATTTCTCCAT TTTNCACGTT TTNCAGTGAT 120 

TTCGTCATTT TCAAGTCCTC AAGTGGATGT TTCTCATTTN CCATGAATTT CAGTTTTCTN 180 

GCCATATTCC ACGTCCTACA GNGGACATTT CTAAATTTNC CACCTTTTTC AGTTTTCCTC 240 

GCCATATTTC ACGTCCTAAA ATGTGTATTT CTCGTTTNCC GTGATTTTCA GTTTTCTCGC 300 

CATATTCCAG GTCCTATAAT GTGCATTTCT CATTTNNCAC GTTTTTCAGT GATTTCGTCA 360

TTTTTTCAAG TCGGCAAGTG GATGTTTCTC ATTTNCCATG ATTTNCAGTT TTCTTGNAAT 420 

ATTCCATGTC CTACAATGAT CATTTTTAAT TTTCCACCTT rTCATTTTTC CACGCCATAT 480

TTCATGTCCT AAAGTGTATA TTTCTCCTTT TCCGCGATTT TGAGTTTTCT CGCCATATTC 540

CAGGTCCTAC AGTGTGCATT CCTCATTTTT CACCTTTTTC ACTGATTTCG TCATTTTTCA 600

AGTCGTGAAG TGGATCTTTC TAATTTTCCA TGATTTTCAG TTATCTTGTC ATATTCCATG 660

TCCTACAGTG GACATTTCTA AATTTTCCAA CTTTTTCAAT TTTTCTCGAC ATATTTGACG 720

TGCTAAAGTG TGTATTTCTT ATTTTCCGTG ATTTTCAGTT TTCTCGCCAT ATTCCAGGTC 780

CTAATAGTGT GCATTTCTCA TTTTTGACGT TTTTCAGTGA TTTCGTCATT TTTTCCAGTT 840

GTCAAGGGGA TGTTTCTCAT TTTCCATGAG TGTCAGTTTT CTTGCTATAT TCCATGTCCT 900

ACAGTGACAT TTCTAAATAT TATACCTTTT TCAGTTTTTC TCACCATATT TCACGTCCTA 960

AAGTATATAT TTCTCATTTT CCCTGATTTT CAGTTTCCTT GCCATATTCC AGGTCCTACA 1020

GTGTGCATTT CTCATTTTTG ACGTTTTTCA GTAATTTCTT CATTTTTTAA GCCCTCAAAT 1080

GGATGTTTCT CATTTTCCAT GATTTTCAGT TTTCTTGCCA TATACCATGT CCTACAGTGG 1140

ACATTTCTAA ATTATCCACC TTTTTCAGTT TTTCATCGGC ACATTTCACG TCCTAAAGTG 1200

TGTATTTCTA ATTTTCAGTG ATTTTCAGTT TTCTCGGCAT ATTCCAGGAC CTACAGTGTG 1260

CATTTGrrCAT TTTTCACGTT TTTCAGTGAA TTC 1293




(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1044 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(ix) FEATURE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

AGGCCTATGG TGAAAAAGGA AATATCTTCC CCTGAAAACT AGACAGAAGG ATTCTCAGAA 60

TCTTATTTGT GATGTGCGCC CCTCAACTAA CAGTGTTGAA GCTTTCTTTT GATAGAGCAG 120

TTTTGAAACA CTCTTTTTGT AAAATCTGCA AGAGGATATT TGGATAGCTT TGAGGATTTC 180

CGTTGGAAAC GGGATTGTCT TCATATAAAC CCTAGACAGA AGCATTCTCA GAAGCTTCAT 240

TGGGATGTTT CAGTTGAAGT CACAGTGTTG AACAGTCCCC TTTCATAGAG CAGGTTTGAA 300

ACACTCTTTT TTGTAGTATC TGGAAGTGGA CATTTGGAGC GATCTCAGGA CTGCGGTGAA 360

AAAGGAAATA TCTTCCAATA AAAGCTAGAT AGAGGCAATG TCAGAAACCT TTTTCATGAT 420

GTATCTACTC AGCTAACAGA GTTGAACCTT CCTTTGAGAG AGCAGTTTTG AAACACTCTT 480

TTTGTGGAAT CTGCAAGTGG ATATTTGTCT AGCTTTGAGG ATTTCGTTGG GAAACGGGAT 540 

TACATATAAA AAGCAGACAG GAGCATTCCC AGAAACTTCT TTGTGATGTT TGCATTCAAG 600 

TCACAGAGTT GAACATTCCC TTTCATAGAG CAGGTTTGAA ACACACTTTT TGATGTATCT 660 

GGATGTGGAC ATTTGCAGCG CTTTCAGGCC TAAGGTGAAA AGGAAATATC TTCCCCTGAA 720

AACTAGACAG AAGCATTCTC AGAAACTTAT TTGTGATGTG CGCGCTCAAC TAACAGTGTT 780

GAAGCTTTGT TTTCATAGAG GCAGTTTTGA AACACTCTTT TGTGGAATCT GCAAGTGGAT 840

ATTTGTCTAG CTTTGAGGAT TTCTTTGGAA ACGGGATTAC ATATAAAAAG CAGACAGCAG 900

CATTCCCAGA ATCTTGTTTG TGATGTTTGC ATTCAAGTCA CAGAGTTGAA CATTCCCTTT 960

CAGAGAGCAG GTTTGAACAC TCTTTTTATA GTATCTGGAT GTGGACATTT GGAGCGCTTT 1020

CAGGGGGGAT CCTCTAGAAT TCCT 1044 



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2492 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:
(ix) FEATURE:



60 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GGATCTATGG GGGTGGGGAG aSgCcSStc TAGGACTACC TGGGGGCTGT :^ 120 

•GAGGGTCTGA GGAACATAGA GCTGGCpf^o ^oS^^S^^ GAAGAGACAA GGTGGCCTGA 180 
AATGGGACAG GCTtSS^? A???SS?2 ctSI^cS^? ^^'=^°' GGAAGTGAGG 

TGCTATCCTG GGGTTCAACC CCCCAGGTTG TAGCAAGGAG GGCTTGGGGT 300 

ATTACAATGG ACACAGGAG^ SSSJS?? TCsSrSn^ GGGAGATGGT CCCAGGACAT ' 360 

AGACCATGAG TAGGGGTGTC SotEJS?? SJJSSr^ ^TGCCAAGAG 420 

AGGGCCCCTG CTGCCACCTA GrGGCTGirr r^^lnft^E.':^ GCTGCATTGT TCAAATCCAA 480 

TAGGGTCTCT GTGAAGACCA l^^lr T^l^^l GACCCTGGGC CACACGCGTT 54 0 

TTTCCACCTA TTCGAAACAA TCaSJSS Jcca^^™ GACTCCTAAA TGAGCAGAGA 600 

CTAAGGCTAG GGATAGGG?G GGaSIJSJ? f^^"^"^^ GGGGATGGCA 660 

GATCAACGTT GGTTAGGAGT TAGGGATAcI r^InnJZ^S^ GTAAGGGGTT TAGGGTTAGG 720 

TTAGGGGTTA GGGTTAGGG? TaSSS^^JS r™^™^ GGTAGGGTTA GGGGTTAGGG 780 

GGGTTAGGTT TTGGGgIgGC SSSSS? GGGGTTAGGG GTTAGGGTTA 840 

AGAGTTCTTG TTTItEcTT? aSSJI???2? P GTGTTCCACT GGCAATGAAA 900 

GATATAGACC AGCTGTGCTA TCTC^J^^^r ^^EIIIF^ AGAGTTTAGC AATTCTAACA 960 

ATGTGTTTAC TTGcStSS ?SS????? ^^^^JJ. GTAACCACAT TGTGGTTTCA 1020 

.CATOTCTTGN NTTTNGGCTG TTT^cSS J^^I^S TGTCTGTTCA GATGTGTGTG 1080 

-GAAGACAAAT CTTTCTCAGA TGTCTaSg ofll^^S^J TAATAATTTT TTATATATTT 1140 

TGTCTCTAAC AAGGTCTC^ SgaSSJ? S^™? TTCAATATGA GGCTTGCTTT 1200 

TTTTGTGTAT ATCTACCTTT TGTGTcJ??? S^It^^ CTGTCACTTC. 1260 

ATAGCTTTTC TTCTATTGTT TCTT?S^ ^^"^"^ CCAAA6GCAG 1320 

GATGATTTTG AGTGATTATT TGTGTAAGTT G^fn^^ TTTTGCATTT TTAGTGTAAG 1380 

CTTATGGTTT CCAATTAATC GTTCCCrrl^ xI^^T^ CGTCTATATC CATATCATTT 1440 
TTTGTTAGAG TAGaSgGTA cSSA^Sr I^IT^^^ AAAGACACAG GATAGTGGGC 1500
GTCTGGGAAG GCtScctS aSga?JSJ?2 Jtfa^^?^ GGCCTCCTGG AAAAGGGAAA 1560
ctggagtgga tgggcaCttg tSaa?5SSS A^^JI^^ TATTAGTAGC ATCTCTAGTG 1620
GAAACTCCCT AGAACTCCTC tSaS?g?2 AAAGAGGTCC. TATGCAGAAA 1680

gaatattgct agctacatgc tcXtS^SS ACTCTGCAAT AAAAATGTCA 1740

CCATAAGTAC AGATTAGGGr AotJ^^^?^? AAAGGGGACA TTCTTAAGTG AAACCTGGCA 1800

acgtgatcg? cS^^^^ ggcaggcgca gtaggtacS 1860

GTATTGATCA CCACACATAT ACC?cS?^ ^^^^^^ GCTGGTGCCA GAGTGGATTC 1920

GGCAAGTTGG GGAGCT^ J^otS^S^ JSfrr^^ GGTCCCACAA GCCTAAGTGG 1980

TGAGACAGAG GCAGGAAtSt GAAgSS??? JS^^ftf^S f^GAAAACA GGTGGAGACT 2040

GCTGTTTAAT GGATCGCTCA ctScSJS?? TCCCTGCACA GGACTCTTAG 2100

CTGTGTTTCT TTTCAATGAA GTSlScT^rr IS^S^T" TCTACAATAA ACTCTTTACA 2160

TCTTCCAAGT TAAACAAGAA cSSSS5r a^^^^SI^^ TGCCTCTTGG TGAAAATGTT 2220

TTGAATTTAC AGAACTGATG SctJS^Ia AGTAATAGCT CCGTTTCAGT 2280

CCGTCACACC GGGACCAaS ctSJotIS^ GACTTTAGTG GTGCAGGAGG 2340

CCTCGACACT GACAgSaS GGG??S^^nA ^^^l^^'^^^ CTGCCCGCAG GTGGCGGCTG 2400

ACQACTACAC TG^iSJ^G SSSSSJ JJGTCCCCAG CTGCCAGCAG GGGGCGTACG 2460



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
GGGGAATTCA TTGGGATGTT TCAGTTGA 28 
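
Each entry in this listing declares a LENGTH, and each sequence row ends in a running position counter, so an entry can be checked for internal consistency. The Python sketch below shows one way to do that for SEQ ID NO:4. The function names are hypothetical, and the permitted alphabet (A, C, G, T, U and N) is an assumption based on the characters that appear in this listing.

```python
import re


def parse_listing_row(row):
    """Split one listing row into (bases, running position counter or None)."""
    tokens = row.split()
    counter = int(tokens.pop()) if tokens and tokens[-1].isdigit() else None
    return "".join(tokens).upper(), counter


def entry_matches_length(rows, declared_length):
    """Check the running counters and the total length of one SEQ ID entry."""
    bases = ""
    for row in rows:
        chunk, counter = parse_listing_row(row)
        if re.fullmatch(r"[ACGTUN]+", chunk) is None:
            raise ValueError(f"unexpected characters in row: {row!r}")
        bases += chunk
        if counter is not None and counter != len(bases):
            raise ValueError(f"counter {counter} != {len(bases)} bases read")
    return len(bases) == declared_length


# SEQ ID NO:4 as printed above (declared LENGTH: 28 base pairs).
SEQ_ID_NO_4 = ["GGGGAATTCA TTGGGATGTT TCAGTTGA 28"]
print(entry_matches_length(SEQ_ID_NO_4, 28))
```

The same check can be applied row by row to the longer entries; a row whose counter disagrees with the number of bases read so far raises an error instead of being silently accepted.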
(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 
CGAAAGTCCC CCCTAGGAGA TCTTAAGGA 29
(2) INFORMATION FOR SEQ ID NO : 6 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: RNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:
(ix) FEATURE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 
CCGCTTAATA CTCTGATGAG TCCGTGAGGA CGAAACGCTC TCGCACC 



(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 



WO 97/40183

PCT/US97/05911

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(ix) FEATURE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
CGATTTAAAT TAATTAAGCC CGGGC 25

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:
(ix) FEATURE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

TAAATTTAAT TAATTCGGGC CCGTCGA 27

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 69 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA
(D) OTHER INFORMATION: IL-2 signal sequence

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

.ss ss s; s s sss s s s s s 48

GTC ACA AAC AGT GCA CCT ACT

Val Thr Asn Ser Ala Pro Thr

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 945 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 1...942

(D) OTHER INFORMATION: Renilla reniformis luciferase

(X) PUBLICATION INFORMATION: 

PATENT NO. : 5,418,155 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

AGC TTA AAG ATG ACT TCG AAA GTT TAT GAT CCA GAA CAA AGG AAA CGG 48
Ser Leu Lys Met Thr Ser Lys Val Tyr Asp Pro Glu Gln Arg Lys Arg
1 5 10 15

ATG ATA ACT GGT CCG CAG TGG TGG GCC AGA TGT AAA CAA ATG AAT GTT 96
Met Ile Thr Gly Pro Gln Trp Trp Ala Arg Cys Lys Gln Met Asn Val
20 25 30

CTT GAT TCA TTT ATT AAT TAT TAT GAT TCA GAA AAA CAT GCA GAA AAT 144
Leu Asp Ser Phe Ile Asn Tyr Tyr Asp Ser Glu Lys His Ala Glu Asn
35 40 45

GCT GTT ATT TTT TTA CAT GGT AAC GCG GCC TCT TCT TAT TTA TGG CGA 192
Ala Val Ile Phe Leu His Gly Asn Ala Ala Ser Ser Tyr Leu Trp Arg
50 55 60

CAT GTT GTG CCA CAT ATT GAG CCA GTA GCG CGG TGT ATT ATA CCA GAT 240
His Val Val Pro His Ile Glu Pro Val Ala Arg Cys Ile Ile Pro Asp
65 70 75 80

CTT ATT GGT ATG GGC AAA TCA GGC AAA TCT GGT AAT GGT TCT TAT AGG 288
Leu Ile Gly Met Gly Lys Ser Gly Lys Ser Gly Asn Gly Ser Tyr Arg
85 90 95

TTA CTT GAT CAT TAC AAA TAT CTT ACT GCA TGG TTG AAC TTC TTA ATT 336
Leu Leu Asp His Tyr Lys Tyr Leu Thr Ala Trp Leu Asn Phe Leu Ile
100 105 110

TAC CAA AGA AGA TCA TTT TTT GTC GGC CAT GAT TGG GGT GCT TGT TTG 384
Tyr Gln Arg Arg Ser Phe Phe Val Gly His Asp Trp Gly Ala Cys Leu
115 120 125

GCA TTT CAT TAT AGC TAT GAG CAT CAA GAT AAG ATC AAA GCA ATA GTT 432
Ala Phe His Tyr Ser Tyr Glu His Gln Asp Lys Ile Lys Ala Ile Val
130 135 140

CAC GCT GAA AGT GTA GTA GAT GTG ATT GAA TCA TGG GAT GAA TGG CCT 480
His Ala Glu Ser Val Val Asp Val Ile Glu Ser Trp Asp Glu Trp Pro
145 150 155 160

GAT ATT GAA GAA GAT ATT GCG TTG ATC AAA TCT GAA GAA GGA GAA AAA 528
Asp Ile Glu Glu Asp Ile Ala Leu Ile Lys Ser Glu Glu Gly Glu Lys
165 170 175

ATG GTT TTG GAG AAT AAC TTC TTC GTG GAA ACC ATG TTG CCA TCA AAA 576
Met Val Leu Glu Asn Asn Phe Phe Val Glu Thr Met Leu Pro Ser Lys
180 185 190

ATC ATG AGA AAG TTA GAA CCA GAA GAA TTT GCA GCA TAT CTT GAA CCA 624
Ile Met Arg Lys Leu Glu Pro Glu Glu Phe Ala Ala Tyr Leu Glu Pro
195 200 205




13^ f*^ GGT GAA GTT CGT CGT CCA ACA TTA TCA TGG CCT COT 672 

Phe Lys Glu Lys Gly Glu Val Arg Arg Pro Tlir Leu Ser Pro to 

' ' 215'. 220 ;■■ 

GAA ATG CCG TTA GTA AAA GGT GGT AAA CCT GAC GTT GTA CAA ATT GTT ^^n 
: Glu lie Pro Leu val I^s Gly Gly Lys Pro Asp Val Val Gin xil SJi /"^^ 
■ 230 235 240 

^ ^ AAT TATAAT^OT AGT GAT GAT TTA CGA AAA ATG^' 768 

Arg Asn Tyr Asn Ala Tyr Leu Arg Ala Ser Asp Asp Leu Pro L^ Me? 

^ - — -~T - - ~~T24-5 ^ 250- — 7 255" ~" ' ^ "'^ 

^ I^^ CCA GGA TTC TTT TCC AAT GCT ATT GTT GAA GGC 8X6 

Phe lie Glu ser Asp Pro Gly Phe Phe Ser Asn Ala lie Val Glu Glv 
260 265 270 

GCC AAG AAG TTT CCT AAT ACT GAA TTT GTC AAA GTA AAA GGT CTT CAT 864 
Ala Lys Lys Phe Pro Asn Thr Glu Phe Val Lys Val Lys Gly Leu His 
; 275 280 285 

SI If^ ^ ^ GCA CCT GAT GAA ATG GGA AAA TAT ATC AAA TCG 912 

fon ^^"^ Met Gly Lys Tyr He Lys Ser 

295 300 

TTC GTT GAG CGA GTT CTC AAA AAT GAA CAA TAA 945
Phe Val Glu Arg Val Leu Lys Asn Glu Gln
305 310
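
Because SEQ ID NO:10 is printed as codons with their amino acids underneath, each row can be checked by translating the codons with the standard genetic code. The Python sketch below does this for the first and last rows as printed in this listing. The helper names are hypothetical, and the three-letter label End for a stop codon is an assumption of this sketch.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate_row(codon_row):
    """Translate a space-separated codon row into three-letter residues."""
    codons = codon_row.upper().split()
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in codons)


FIRST_ROW = "AGC TTA AAG ATG ACT TCG AAA GTT TAT GAT CCA GAA CAA AGG AAA CGG"
LAST_ROW = "TTC GTT GAG CGA GTT CTC AAA AAT GAA CAA TAA"
print(translate_row(FIRST_ROW))
print(translate_row(LAST_ROW))
```

A row whose translation disagrees with the printed amino acid line points to a misread base or residue in that row.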

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:

(ix) FEATURE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
TTTGAATTCA TGTACAGGAT GCAACTCCTG 30
(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:

(ix) FEATURE:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
TTTGAATTCA GTAGGTGCAC TGTTTGTCAC 30
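
SEQ ID NOs: 11 and 12 both contain the hexamer GAATTC, which is the recognition sequence of the restriction enzyme EcoRI. The Python sketch below locates that site in each sequence as printed in this listing. The helper names are hypothetical, and the sketch says nothing about how the oligonucleotides are used elsewhere in the specification.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence


def clean(seq):
    """Remove the block spacing used in the listing and upper-case the bases."""
    return "".join(seq.split()).upper()


def site_positions(seq, site=ECORI_SITE):
    """Return the 1-based start positions of every occurrence of site in seq."""
    seq = clean(seq)
    width = len(site)
    return [i + 1 for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


OLIGOS = {
    "SEQ ID NO:11": "TTTGAATTCA TGTACAGGAT GCAACTCCTG",
    "SEQ ID NO:12": "TTTGAATTCA GTAGGTGCAC TGTTTGTCAC",
}

for name, seq in OLIGOS.items():
    print(name, len(clean(seq)), site_positions(seq))
```

Both oligonucleotides are 30 bases long and carry a single GAATTC starting at position 4.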
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 1434 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: 

CCTCCACGCA CGTTGTGATA TGTAGATGAT AATCATTATC AGAGCAGCGT TGGGGGATAA 60 

TGTCGACATT TCCACTCCCA ATGACGGTGA TGTATAATGC TCAAGTATTC TCCTGCTTTT 120 

TTACCACTAA CTAGGAACTG GGTTTGGCCT TAATTCAGAC AGCCTTGGCT CTGTCTGGAC 180 

AGGTCCAGAC GACTGACACC ATTAACACTT TGTCAGCCTC AGTGACTACA GTCATAGATG 240

AACAGGCCTC AGCTAATGTC AAGATACAGA GAGGTCTCAT GCTGGTTAAT CAACTCATAG 300

ATCTTGTCCA GATACAACTA GATGTATTAT GACAAATAAC TCAGCAGGGA TGTGAACAAA 360

AGTTTCCGGG ATTGTGTGTT ATTTCCATTC AGTATGTTAA ATTTACTAGG ACAGCTAATT 420

TGTCAAAAAG TCTTTTTCAG TATATGTTAC AGAATTGGAT GGCTGAATTT GAACAGATCC 480

TTCGGGAATT GAGACTTCAG GTCAACTCCA CGCGCTTGGA CCTGTCGCTG ACCAAAGGAT 540

TACCCAATTG GATCTCCTCA GCATTTTCTT TCTTTAAAAA ATGGGTGGGA TTAATATTAT 600 

TTGGAGATAC ACTTTGCTGT GGATTAGTGT TGCTTCTTTG ATTGGTCTGT AAGCTTAAGG 660 

CCCAAACTAG GAGAGACAAG GTGGTTATTG CCCAGGCGCT TGCAGGACTA GAACATGGAG 720 

CTTCCCCTGA TATATGGTTA TCTATGCTTA GGCAATAGGT CGCTGGCCAC TCAGCTCTTA 780

TATCCCACGA GGCTAGTCTC ATTGTACGGG ATAGAGTGAG TGTGCTTCAG CAGCCCGAGA 840

GAGTTGCAAG GCTAAGCACT GCAATGGAAA GGCTCTGCGG CATATATGTG CCTATTCTAG 900

GGGGACATGT CATCTTTCAT GAAGGTTCAG TGTCCTAGTT CCCTTCCCCC AGGCAAAACG 960

ACACGGGAGC AGGTCAGGGT TGCTCTGGGT AAAAGCCTGT GAGCCTGGGA GCTAATCCTG 1020

TACATGGCTC CTTTACCTAC ACACTGGGGA TTTGACCTCT ATCTCCACTC TCATTAATAT 1080

GGGTGGCCTA TTTGCTCTTA TTAAAAGGAA AGGGGGAGAT GTTGGGAGCC GCGCCCACAT 1140 

TCGCCGTTAC AAGATGGCGC TGACAGCTGT GTTCTAAGTG GTAAACAAAT AATCTGCGCA 1200 

TGTGCCGAGG GTGGTTCTTC ACTCCATGTG CTCTGCCTTC CCCGTGACGT CAACTCGGCC 1260 

GATGGGCTGC AGCCAATCAG GGAGTGACAC GTCCTAGGCG AAGGAGAATT CTCCTTAATA 1320 

GGGACGGGGT TTCGTTCTCT CTCTCTCTCT TGCTTCTCTC TCTTGCTTTT TCGCTCTCTT 1380 

GCTTCCCGTA AAGTGATAAT GATTATCATC TACATATCAC AACGTGCGTG GAGG 1434

(2) INFORMATION FOR SEQ ID NO : 14 : 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1400 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
CCTCCACGCA
TGTCGACATT 
TTACCACTAA 
AGGTCCAGAT 
TTCCGGGATT 
CAAAAAGTCT 
GGGAATTGAG 
" CCAATTGGAT 
GAGATACACT 
AAACTAGGAG 
CCCCTGATAT 
AGGCTAGTCT 
GGCTAAGCAC 
TCATCTTTCA 
CAGGTCAGGG 
CCTTTACCTA 
ATTTGCTCTT 
CAAGATGGCG 
GGTGGTTCTT 
CAG TCAATCA 
TTTCGTTTTC 
TGTAAGAATA 
TGAGAACGCG 
. TATCACAACG 



CGTTGTGATA 
TCCACTCCCA 
CTAGGAACTG 
ACAAGTAGAT 
GCGTGTTATT 
TTTCCAGTAT 
ACTTCAGGTC- 

ctcctcagca 
ttgctgtgga 
Agacaaggtg 

ATCTATGCTT 
CATTGCACGG 
TGCAATGGAA 
AGAAGGTTGA 
TTGCTCTGGG 
CACACTGGGG 
ATTAAAAGGA 
CTGACAGCTG 
CACTCCATGT 
GGGAGTGACA 
TCTCTCTCTT 
AAGCTTTGCC 
TCTAATAACA 
TGCGTGGAGG 



TGTAGATGAT 
ATGACGGTGA 
GGTTTGGCCT 
GTATTATGAC 
TCCATCCAGT 
ATGTTACAGA 
AACTCCAGGG- 
TTTTCTTTCT 
TTAGTGTTGG 
GTTATTGGCC 
AGGGAATAGG 
GATAGAGTGA 
AGGCTCTGGG 
GTGTCCAAGT 
TAAAAGCCTG 
ATTTGACCTC 
AAGGGGGAGA 
TGTTCTAAGT 
GCTCTGCGTT 
CGTGCTAGGG 
GCTTCGCTCT 
GCAGAAGATT 
ATTGGTGCGG 



AATCATTATC 
TGTATAATGC 
TAATTCAGAC 
AAATAACTCA 
ATGTTAAATT 
ATTGGATGGC 
GGTTGGACCT 
TTAAAAAATG 
TTGTTTGATT 
AGGCGCTTGG 
TCGCTGGCGA 
GTGTGCTTCA 
GCATATATGA 
GTCCTTGCTC 
TGAGGCTAAG 
TATCTCGAGT 
TGTTGGGAGC 
GGTAAAGAAA 
CCCCGTGACG 
GAAGGAAAAT 
CTCTTGGTTC 
CTGGTGTGTG 
AAAGCCGGGT 



AGAGCAGGGT 
TCAAGTATTC 
AGGGTTGGGT 
GGAGGGATGT 
TACTAGGGCA 
TGAATTTGAA 
"GTGGCTGAGC" 
GGTGGGATTA 
GGTGTGTAAG 
AGGACTAGAA 
GTCAGCTGTT 
GGAGGCCGAG 
GGGTATTCTA 
CAGGCT^AAAG 
AGCTAATGGT 
CTGATTAATA 
CGGGCCGAGA 
TAATGTGGGG 
TGAAGTGGGG 
TCTCGTTAAT 
TTGCTGTCTT 
GTGTTGTTCG 
GATAATGATT 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENGE CHARAGTERISTIGS^- 

(A) LENGTH: 1369 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

( ii ) MOLEGULB TYPE : Genomic DNA 
(1X1) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FliAGMENT TYPE: 

(vi) ORIGINAL SOURGE: 

(Xi) SEQUENGE DESCRIPTION : SEQ ID NO: 15: 



TGGGGGATAA 60 

TGGTGGTTTT 120 

CTGTGTGGAG 180 

GAACAAAAGT 240 

GGTAATTTGT 300 
CAGATCCTTC, _ _360/ 

AAAGGATTAG 420 

ATATTATTTG 480 

CTTAAGGGCG 540 

CATGGAGGTT 600 

ATATGGCATG 660 

AGAGTTGGAC 720 

GGGAGACATG 780 

^GACAGGGGAG 84 0 

GTAGATGGGT 900 

TGGGTGGGCT 960 

TTGGGGGTTA 1020 

ATGGGGGGAG 1080 

GGATGGGGTG 1140 

AGGGAGGGGG 1200 

TTGGTGAAGA 1260 

TGGCGGGTCG 1320 

ATCATGTACA 1380 
■ 1400 



GGTCGAGGCA GGTTGTGATA TGTAGATGAT AATCATTATG ACTTTAGGGG TCCTTTCAGT 60

AGAACTGGCA GGAGGGGCGG TGGTGTGGTA ATAGATCTTT GGTGAAAAGG CACAGACATG 120

AGACATTAGT CAAGGTGGGC TGATGTGAGG TGCAGATTCA GGTTAATATG AATGTTGGCA 180

ATTGTGTGAA ATGATAAATG TTCAAAGTGA CACTGATTGC CAGAGACAGG TGGGGACCTT 240

TGGCATAATA AAGAAAGAGA AATTATGTAT TATATAAAGG GTGTTAGAAG ATGGTTTAGA 300

ATAGAAATAA ATGATGGTAG ATAAGAGTAA GTTGAGAGCT TAAATTTAAT AAAGTGATAT 360

ACGTAATAAA AATTAAATTA AGAAGGTGTG AATATAGTAC AGTAGGTAAA TTATTTGATT 420

AATTTATTTT CTTTGTTAAT GGTTTATAAT GTTTTCTGGT ATTGTGAATT GGACATGGAT 480

ATGTTCAATT GTTCAGTGTA ATGAAGAAAT GTAGTAAATA TACTTTGCGA ACAAGTTGTA 540

TCAAATATGT TACAGTTGAT TGCGTGTGTT ACTTATGATT TTATTATTAT ATTGATTGCA 600

TTGGTTGGTT AGTTGATATT ATTAGAAGGT ACATATTTAT TGTGTGAGAT CTTGATTATA 660

CTGTAACGAT TTTATAAGAT AGTTTATTTA TTCATTTCTT ATGTGTGGTG TGAGGCACAA 720

ATGCGAGAGA GAAGTTGAGG AGATAAGAGG ACAAATTGGA AGAGTGAGTT AGGTGCTGCT 780

GTTGGTTGGA AAGTGAGGAT CAAATTCAGG TTGTGAGGCT TGGGAGCATG GAGTTTTTAG 840

CAGTGGCTGG ATGTTGGTAG GGCTGAAGAT GAAGGTTTGC AGAGAGAGAG GGTAGACTAA 900

GTGAAGTGGT GATTGACAGG ATGGATGGTG ATTTATTGTT AGTTTGTATT CGATGGCTTT 960

ACTAfrtCTA GTAGGTGGTA GGTAGTAGTG TATTTGGAGA TAGAAGTTAG TGAAAGAAAA 1020

TTAGATTGTT TTGTATAGAT GGTTGATAGT GTTTCAGCAG ATATAGAGTT TTAATCAGGT 1080

CGTAGAGGGT TTGTTGAGTG TTATTAAATA GTAAGTACAA ATTAAGTTTA TGGAAAAGAG 1140

TAGGGATGTT GATTTTGTGG AGTTCTAGTA TGATAATAGT CTAGCTTGAT AAATGTGACA 1200

CACTTATTGG GAATGTTTTT GTTAATAAAA GATTCAGGTG TTACTCTAGG TCAAGAGAAT 1260

ATTAAACATC AGTCCCAAAT TACAAACTTC AATAAAAGAT TTGACTCTCG AGTGGTGGCA 1320 

ATATAAAGTG ATAATGATTA TCATCTACAT ATCACAACGT GCGTGGAGG 1369 

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22118 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 

GAATTCCCCT ATCCCTAATC CAGATTGGTG GAATAACTTG GTATAGATGT TTGTGCATTA 60 

AAAACCCTGT AGGATGTTCA CTCTAGGTCA CTGTTCAGCA CTGGAACCTG AATTGTGGCC 120

CTGAGTGATA GGTCCTGGGA CATATGCAGT TCTGCACAGA CAGACAGACA GACAGACAGA 180

CAGACAGACA GACAGACGTT ACAAACAAAC ACGTTGAGCC GTGTGCCAAC ACACACACAA 240

ACACCACTCT GGCCATAATT ATTGAGGACG TTGATTTATT ATTCTGTGTT TGTGAGTCTG 300

TCTGTCTGTC TGTCTGTCTG TCTGTCTGTC TATCAAACCA AAAGAAACCA AACAATTATG 360

CCTGCCTGCC TGCCTGCCTG CCTACACAGA GAAATGATTT CTTCAATCAA TCTAAAACGA 420

CCTCCTAAGT TTGCCTTTTT TCTCTTTCTT TATCTTTTTC TTTTTTCTTT TCTTCTTCCT 480

TCCTTCCTTC CTTCCTTCCT TCCTTCCTTT CTTTCTTTCT TTCTTTCTTT CTTACTTTCT 540 

TTCTTTCCTT CTTACATTl^A TTCTTTTCAT ACATAGTTTC TTAGTGTAAG CATCCCTGAC 600 

TGTCTTGAAG ACACTTTGTA GGCCTCAATC CTGTAAGAGC CTTCCTCTGC TTTTCAAATG 660

CTGGCATGAA TGTTGTACCT CACTATGACC AGCTTAGTCT TCAAGTCTGA GTTACTGGAA 720

AGGAGTTCCA AGAAGACTGG TTATATTTTT CATTTATTAT TGCATTTTAA TTAAAATTTA 780

ATTTCACCAA AAGAATTTAG ACTGACCAAT TCAGAGTCTG CCGTTTAAAA GCATAAGGAA 840

AAAGTAGGAG AAAAACGTGA GGCTGTCTGT GGATGGTCGA GGCTGCTTTA GGGAGCCTCG 900 

TCACCATTCT GCACTTGCAA ACCGGGCCAC TAGAACCGGG TGAAGGGAGA AACCAAAGCG 960 

ACCTGGAAAC AATAGGTCAC ATGAAGGCCA GCCACCTCCA TCTTGTTGTG CGGGAGTTCA 1020 

GTTAGCAGAC AAGATGGCTG CCATGCACAT GTTGTCTTTC AGCTTGGTGA GGTCAAAGTA 1080

CAACCGAGTC ACAGAACAAG GAAGTATACA CAGTGAGTTC CAGGTCAGCC AGAGTTTACA 1140

CAGAGAAACC ACATCTTGAA AAAAACAAAA AAATAAATTA AATAAATATA ATTTAAAAAT 1200 

TTAAAAATAG CCGGGAGTGA TGGCGCATGT CTTTAATCCC AGCTCTCTTC AGGCAGAGAT 1260 

GGGAGGATTT CTGAGTTTGA GGCCAGCCTG GTCTGCAAAG TGAGTTCCAG GACAGTCAGG 1320 

GCTATACAGA GAAACCCTGT CTTGAAAACT AAACTAAATT AAACTAAACT AAACTAAAAA 1380

AATATAAAAT AAAAATTTTA AAGAATTTTA AAAAACTACA GAAATCAAAC ATAAGCCCAC 1440 

GAGATGGCAA GTAACTGCAA TCATAGCAGA AATATTATAC ACACACACAC ACACAGACTC 1500 

TGTCATAAAA TCCAATGTGC CTTCATGATG ATCAAATTTC GATAGTCAGT AATACTAGAA 1560 

GAATCATATG TCTGAAAATA AAAGCCAGAA CCTTTTCTGC TTTTGTTTTC TTTTGCCCCA 1620

AGATAGGGTT TCTCTCAGTG TATCCCTGGC ATCCCTGCCT GGAACTTCCT TTGTAGGTTT 1680 

GGTAGCCTCA AACTCAGAGA GGTCCTCTCT GCCTGCCTGC CTGCCTGCCT GCCTGCCTGC 1740

CTGCCTGCCT GCCTGCCTCA CTTCTTCTGC CACCCACACA ACCGAGTCGA ACCTAGGATC 1800

TTTATTTCTT TCTCTTTCTC TCTTCTTTCT TTCTTTCTTT CTTTCTTTCT TTCTTTCTTT 1860

CTTTCTTTCT TTCTTATTCA ATTAGTTTTC AATGTAAGTG TGTGTTTGTG CTCTATCTGC 1920 

TGCCTATAGG CCTGCTTGCC AGGAGAGGGC AACAGAACCT AGGAGAAACC ACCATGCAGC 1980 

TCCTGAGAAT AAGTGAAAAA ACAACAAAAA AAGGAAATTC TAATCACATA GAATGTAGAT 2040

ATATGCCGAG GCTGTCAGAG TGCTTTTTAA GGCTTAGTGT AAGTAATGAA AATTGTTGTG 2100

TGTCTTTTAT CCAAACACAG AAGAGAGGTG GCTCGGCCTG CATGTCTGTT GTCTGCATGT 2160

AGACCAGGCT GGCCTTGAAC ACATTAATCT GTCTGCCTCT GCTTCCCTAA TGCTGCGATT 2220 

AAAGGCATGT GCCACCACTG CCCGGACTGA xxTCTTCTTT TTTTTTTTTT TGGAAAATAC 2280

CTTTCTTTCT TTTTCTCTCT CTCTTTCTTC CTTCCTTCCT TTCTTTCTAT XCTTTTTTTC 2340

TTTCTTTTTT C TrrT T Tl' TT TTTTTTTTAA AATTTGCCTA AGGTTAAAGG TGTGCTCCAC 2400 

AATTGCCTCA GCTCTGCTCT AATTCTCTTT AAAAAAAAAC AAACAAAAAA AAAACCAAAA 2460 
TTCAAATTTC TGTGTTCAAG GTCACcJS^ J?Scf^ n^^^"^"" AAACCCTGTG 2S80

CTACACAGAA.AAACCATATC TCAGAAAAaJ ^t^^^l GAGTTCCAAG TCCGATAGGG 2640 

ACACACACAC ACACACACAC ACA^J^SJJ? JJt^rl^^^ AAACACACAC ACACACACAC 2700 

AAGTCGTGCC TAAAATAAAT aSJt^C^ r^A»^^^^^ CGCGCCGCGG CGATGAGGGG " 2760 

TACTCCTAGA AAaSJ^SJ^J aSIaJSS? JSJ^aI?^ ^^S^^^^ TATGAAGAGG 282^ 
- -AACTCTGAAT TTAGTCTTCG^il^SSSS'Sr??^^ "^^^"^^^^^^ 

ACGGGCGGGC GGGCGGGTGA GTG§§?^S G^rrr^ GAGTGAGGGC GAGCGAGCAG 2940 

ACCCCAAGCG GTAGAGTGTT TtIaAaSS SS^Sfa^^ AAAACAACAA 3000 

CCACCCTCCT CTTCCACTG? SJ^^tc ^S^JJ^S I^^^^S ^AGGTCGCCG 3060 

CTGTGCCTAA CTGTGCCTGT TCGCTCACCC crA^nfS^ ACTGTGCTCC CTTCCCCTAA 3120 

CAAGAACGAT TTTGCCTGTT ^TCACCgSc crrr^^IZn^ CCAGCGACGT ACTTTGACTT 3180 

GTCTAGCCCG TTCGCTATGT TaGcSJcSf rnlS^J^^ TTTCGTTTTT GGGTGCCCGA 3240 

AGTGGTGGGT GGGTACGCTG Sccot?^ rrrn^^^^ CGTTTGTGCC ACTCGGGAGA 3300 

AGACCCTCCG GAGAGACaS SSStoa^t TGCCGGAACC TGAGCTCGGG 3360 

GGTTTGTATG GTTGATCgS JJSJJSJSg ^rtrt^'^ GCGCGTGACG GATCTGTATT 3420 

ACGCTCCAGG CCTCTCAGGT ?SS?gJSS SSS^rfl ^^^2°^°^^ AGTTTCGGGA 3480 

AGGGTGACAG GAGGCCGGGC AAGCAgSS SJScr^rT^ ^Jo^^'^^^*^ TGAGGCGACC 3 540 

GACGGTCTCT AACAAGGAGG TCGTACAOn^ f^fSoJS^'^^ GAGATGGTGT CGTGTTTAAG 3600 

. GCCCTTTTGG GAAAAATG?? ISScJSCTr r^I^^SS^ AGCAGACCGA GTTGCTGTAC 3660 

AGTCCTACCC CCCCcS???? JSJct^ao I^^S^''^^^^ AGAAGGCTTA 3720 

CACCGGGGGC ACCGTACATC TGA^Gnrra^ AAGCCCTCTC TTGTCCCCGT 3780 

T6TGGCTCGG CcSSJJSSJS JS^SSStc? ^^^S^ TCCAAGCCGG 3840 

GAAGCCTTGT CTGTCGCTGT CACCGGgSS? SSIta^^^S Z^^HF™^ TTTTCCTCCA 3 900 

GGGCCGCGGC TTCCAAGCCG GTGTGGr?n^ ^S^fJ^SH, TGAGGCCGAG AGGACGCGAT 3 960 

TTTTTTTTTT TTTTTTTCTr ^I^In^o^ GCCAGCTGGA GCTTCGGGTC T l 1 i 1 rrriT 4020 

; tctgaggS SSSJScS It^^?^'' gcgctg^act toll 

dCTTCGGGTT TTTTTTTTTC CTCCAGAA^ nr^^^S.^ ATGTGGCGGG GCCAGCTGGA 4140 

TACTTCTGAG GCCgJSSSS Cg5S?^? JJSSot^S rnnn^l'^'''' GGGGGCGCTG 42oS 

CTGGAGCTTT GGATCTTTTT TTtStTTTT ^SS^Irff^ ^^^S^^^"^^ GCCCGGTCAG 4260 , 

CGGGGGCACC TTAGATCTGA arrrnlnlnr. CCCTCTCTTG TCCCCGTCAC 4320 

, GGCGGGGCCA. ^^lol^- ?tS???c?? '^ctt^^ AAGCCGATGT. tll^O 

CGTCACCGGG GGCGCTGTAC TTCTGLL-^rn JU^^^ CAGAAGCCCT CTCTTGTCCC 4440 

GGATGTCGCC CGGtSgCTG SIS???^f GATGGGCCCG GGTTCCAGGC 4500 

TCTTGTCCCC GTCACCGGGG iSSScG^f^A l^IHIF^ TTTTCCCTCC AGAAGCCCTC 4560 

GTTCCAAGCC GaSSJSS??? ISS^? ^S^'^^^ ATGGGCCTGT 462? 

CAGAAGCCTT GTCTGTCGCT GtScCCG^ Gcr^r??^ CTTTTTTTTT TTTTTTCCTC 4680 

ATGGGCCCGG CTTCCAAGCG GaTC?SG??? S^crlr^^S^ TCTGAGGCCG AGAGGACGCG 4740 

TTTTTTTTTT TTCCTCCaS Sc^SgJ?? CT^^^r^nf oS^I^^^^ CTTTTTTTTT 4800 

GATGCCGAGA GGACGCGATG GGCCCGTc?? r^^^^^^^?^ CCCGGGGCGC TTGTACTTCT 4860 

TTTGGATCTT TTTTTtJtt? JJScctcS r^^^^^S^^ GTGGCCCGGT CAGCTGGAGC 4920 

CACCTTACAT CTGaSgStA SSSSScS SSSSSrroi^ ^1®^^^^^° TCACCGGGGG 4980 

GTCAGCTGGA GCTTTGGATC TtTTi-^^-W' TTCCAGGCCG ATGTGGCCCG 5040 

ACCGGTGGCA CTG?aS??5 JA^CGgS J^^^^ GAAGCCCTCT TGTCCCCGTC 5100 

GTGGCCCGGT CAG??§SIg? JJJSaJSS gTtTtTrS CCAATCCGAT 5160 

CCTCTTGTCC CTGTCACCGG TGG^rr^ nl^^^IT^ TAATTTTTTC TTCCAGAAGC 5220 

GGCTTCCAGG CCGATGtSS? ??§otS2ct ^l^n^S C^AGAGGACA TTATGGGCCC 5280 

TTTTTCCTCC AGAAGCCCTC T??G??^OTr ^^nn^^ ATCTTTTTTT TTTTTTTTCT 5340 

GGGAAAGCTA TGGG^S^S? tS???5S? JSJSSJS a^^^l^^'t'' ^TGAGGCCGA 5400 

TGTCAGGGTC GACCAGTTGT TCCTTTraP? ^o^S^m^SS ^^CTTATCAG TTCTCCGGGT 5460 

GGGCCACCTC CCCAGGtItG A^SJJS^r r^n^™ TTCGTTATGG GGTCATTTTT 552 0 

TCTCTTTTAT GC^?OTGA?? TOcStc? §????TfTTn TTCCTCCCTG 558 0 

CACGCTGTCC TTTCCCTATT AaS^CtTI^ GACCTGGAGA TAGGTACTGA 5640 

GCTGTTTTGC TTGtcSSS JJSc????? J^rr^^^ AGAGACCCTT TCGATTTAAG 5700 

CTGTCCCCGA GCCACGCt5c CTgS??S? ^^^S^ GTGCCTGAAG 5760. 

GCAGCTTGTG ACAACTSSaC SJctScS ?§?5S?2S«? ^t^n^ CTTGCTGTGG 5820 

CCCGAGGTGT CGTTGTCACA CCTGTCrrrr'^ TCCCGATTTC 5880 " 

GCCACCTTAT TTCGGCTCAC Jto™? GGAGCCAGCT GTGGTTGAGG 5940 

TCTTTTCTCT TCCCGGTC?? tScSS? ll^^nnl^ TTGGAGTCCC GAACCTCCGC 6000 

TTCTTTTTTT TTTTTT^ ^fSSSS? SS???? S^JSgS SJ??^ ^11 
TGGTGTCCAA GTGTTCATGC CACGTGCCTC CCGAGTGCAC TTTTTTTTGT GGCAGTCGCT 6180

CGTTGTGTTC TCTTGTTCTG TGTCTGCCCG TATCAGTAAC TGTCTTGCCC CGCGTGTAAG 6240

AGATTCCTAT CTCGCTTGTT TCTCCCGATT GCGCGTCGTT GCTCACTCTT AGATCGATGT 6300 

GGTGCTCCGG AGTTCTCTTC GGGCCAGGGC CAAGCCGCGC CAGGCGAGGG ACGGACATTC 6360 

ATGGCGAATG GCGGCCGCTC TTCTCGTTCT GCCAGCGGGC CCTCGTCTCT CCACCCCATC 6420

CGTCTGCCGG TGGTGTGTGG AAGGCAGGGG TGCGGCTCTC CGGCCCGACG CTGCCCCGCG 6480 

CGCACTTTTC TCAGTGGTTC GCGTGGTCCT TGTGGATGTG TGAGGCGCCC GGTTGTGCCC 6540

TCACGTGTTT CACTTTGGTC GTGTCTCGCT TGACCATGTT CCCAGAGTCG GTGGATGTGG 6600 

CCGGTGGCGT TGCATACCCT TCCCGTCTGG TGTGTGCACG CGCTGTTTCT TGTAAGCGTC 6660 

GAGGTGCTCC TGGAGCGTTC CAGGTTTGTC TCCTAGGTGC CTGCTTCTGA GCTGGTGGTG 6720 

GCGCTCCCCA TTCCCTGGTG TGCCTCCGGT GCTCCGTCTG GCTGTGTGCC TTCCCGTTTG 6780

TGTCTGAGAA GCCCGTGAGA GGGGGGTCGA GGAGAGAAGG AGGGGCAAGA CCCCCCTTCT 6840

TCGTCGGGTG AGGCGCCCAC CCCGCGACTA GTACGCCTGT GCGTAGGGCT GGTGCTGAGC 6900

GGTCGCGGCT GGGGTTGGAA AGTTTCTCGA GAGACTCATT GCTTTCCCGT GGGGAGCTTT 6960

GAGAGGCCTG GCTTTCGGGG GGGACCGGTT GCAGGGTCTC CCCTGTCCGC GGATGCTCAG 7020

AATGCCCTTG GAAGAGAACC TTCCTGTTGC CGCAGACCCC CCCGCGCGGT CGCCCGCGTG 7080 

TTGGTCTTCT GGTTTCCCTG TGTGCTCGTC GCATGCATCC TCTCTCGGTG GCCGGGGCTC 7140 

GTCGGGGTTT TGGGTCCGTC CCGCCCTCAG TGAGAAAGTT TCCTTCTCTA GCTATCTTCC 7200

GGAAAGGGTG CGGGCTTCTT ACGGTCTCGA GGGGTCTCTC CCGAATGGTC CCCTGGAGGG 7260 

CTCGCCCCCT GACCGCCTCC CGCGCGCGCA GCGTTTGCTC TCTCGTCTAC CGCGGCCCGC 7320 

GGCCTCCCCG CTCCGAGTTC GGGGAGGGAT CACGCGGGGC AGAGCCTGTC TGTCGTCCTG 7380

CCGTTGCTGC GGAGCATGTG GCTCGGCTTG TGTGGTTGGT GGCTGGGGAG AGGGCTCCGT 7440 

GCACACCCCC GCGTGCGCGT ACTTTCCTCC CCTCCTGAGG GCCGCCGTGC GGACGGGGTG 7500

TGGGTAGGCG ACGGTGGGCT CCCGGGTCCC CACCCGTCTT CCCGTGCCTC ACCCGTGCCT 7560 

TCCGTCGCGT GCGTCCCTCT CGCTCGCGTC CACGACTTTG GCCGCTCCCG CGACGGCGGC 7620 

CTGCGCCGCG CGTGGTGCGT GCTGTGTGCT TCTCGGGCTG TGTGGTTGTG TCGCCTCGCC 7680

CCCCCCTTCC CGCGGCAGCG TTCCCACGGC TGGCGAAATC GCGGGAGTCC TCCTTCCCCT 7740

CCTCGGGGTC GAGAGGGTCC GTGTCTGGCG TTGATTGATC TCGCTCTCGG GGACGGGACC 7800

GTTCTGTGGG AGAACGGCTG TTGGCCGCGT CCGGCGCGAC GTCGGACGTG GGGACCCACT 7860

GCCGCTCGGG GGTCTTCGTC GGTAGGCATC GGTGTGTCGG CATCGGTCTC TCTCTCGTGT 7920 

CGGTGTCGCC TCCTCGGGCT CCCGGGGGGC CGTCGTGTTT CGGGTCGGCT CGGCGCTGCA 7980 

GGTGTGGTGG GACTGCTCAG GGGAGTGGTG CAGTGTGATT CCCGCCGGTT TTGCCTCGCG 8040 

TGCCCTGACC GGTCCGACGC CCGAGCGGTC TCTCGGTCCC TTGTGAGGAC CCCCTTCCGG 8100

GAGGGGCCCG TTTCGGCCGC CCTTGCCGTC GTCGCCGGCC CTCGTTCTGC TGTGTCGTTC 8160

CCCCCTCCCC GCTCGCCGCA GCCGGTCTTT TTTCCTCTCT CCCCCCCTCT CCTCTGACTG 8220

ACCCGTGGCC GTGCTGTCGG ACCCCCCGCA TGGGGGCGGC CGGGCACGTA CGCGTCCGGG 8280

CGGTCACCGG GGTCTTGGGG GGGGGCCGAG GGGTAAGAAA GTCGGCTCGG CGGGCGGGAG 8340

GAGCTGTGGT TTGGAGGGCG TCCCGGCCCC GCGGCCGTGG CGGTGTCTTG CGCGGTCTTG 8400

GAGAGGGCTG CGTGCGAGGG GAAAAGGTTG CCCCGCGAGG GCAAAGGGAA AGAGGCTAGC 8460

AGTGGTCATT GTCCCGACGG TGTGGTGGTC TGTTGGCCGA GGTGCGTCTG GGGGGCTCGT 8520 

CCGGCCCTGT CGTCCGTCGG GAAGGCGCGT GTTGGGGCCT GCCGGAGTGC CGAGGTGGGT 8580 

ACCCTGGCGG TGGGATTAAC CCCGCGCGCG TGTCCCGGTG TGGCGGTGGG GGCTCCGGTC 8640 

GATGTCTACC TCCCTCTCCC CGAGGTCTCA GGCCTTCTCC GCGCGGGCTC TCGGCCCTCC 8700 

CCTCGTTCCT CCCTCTCGCG GGGTTCAAGT CGCTCGTCGA CCTCCCCTCC TCCGTCCTTC 8760 

CATCTCTCGC GCAATGGCGC CGCCCGAGTT CACGGTGGGT TCGTCCTCCG CCTCCGCTTC 8820 

TCGCCGGGGG CTGGCCGCTG TCCGGTCTCT CCTGCCCGAC CCCCGTTGGC GTGGTCTTCT 8880

CTCGCCGGCT TCGCGGACTC CTGGCTTCGC CCGGAGGGTC AGGGGGCTTC CCGGTTCCCC 8940

GACGTTGCGC CTCGCTGCTG TGTGCTTGGG GGGGGCCCGC TGCGGCCTCC GCCCGCCCGT 9000

GAGCCCCTGC CGCACCCGCC GGTGTGCGGT TTCGCGCCGC GGTCAGTTGG GCCCTGGCGT 9060 

TGTGTCGCGT CGGGAGCGTG TCCGCCTCGC GGCGGCTAGA CGCGGGTGTC GCCGGGCTCC 9120 

GACGGGTGGC CTATCCAGGG CTCGCCCCCG CCGACCCCCG CCTGCCCGTC CCGGTGGTGG 9180 

TCGTTGGTGT GGGGAGTGAA TGGTGCTACC GGTCATTCCC TCCCGCGTGG TTTGACTGTC 9240

TCGCCGGTGT CGCGCTTCTC TTTCCGCCAA CCCCCACGCC AACCCACCAC CCTGCTCTCC 9300

CGGCCCGGTG CGGTCGACGT TGCGGCTCTC CCGATGCCGA GGGGTTCGGG ATTTGTGCCG 9360 

GGGACGGAGG GGAGAGCGGG TAAGAGAGGT GTCGGAGAGC TGTCCCGGGG CGACGCTCGG 9420 

GTTGGCTTTG CCGCGTGCGT GTGCTCGCGG ACGGGTTTTG TCGGACCCCG ACGGGGTCGG 9480

TCCGGCCGCA TGCACTCTCC CGTTCCGCGC GAGCGCCCGC CCGGCTCACC CCCGGTTTGT 9540

CCTCCCGCGA GGCTCTCCGC CGCCGCCGCC TCCTCCTCCT CTCTCGCGCT CTCTGTCCCG 9600 

CCTGGTCCTG TCCCACCCCC GACGCTCCGC TCGCGCTTCC TTACCTGGTT GATCCTGCCA 9660 

GGTAGCATAT GCTTGTCTCA AAGATTAAGC CATGCATGTC TAAGTACGCA CGGCCGGTAC 9720

AGTGAAACTG CGAATGGCTC ATTAAATCAG TTATGGTTCC TTTGGTCGCT CGCTCCTCTC 9780
CTACTTGGAT
TCCCGGGGGG 
CTCCGGCCGG 
ACGCCCCCCG 
TCGCCGTGCC 
GAGCCTGAGA 
CGACCCGGGG 
- ~ GGAATGAGTC 
CAGCCGCGGT 
TAGTTGGATC 
, , CCCCTTGCCT 
GTTTACTTTG 
AGGAATAATG 
TAAGAGGGAC 
GCAAGACGGA 
TCGGAGGTTC 
CGATGCGGCG 
GGTTCCGGGG 
CCAGGAGTGG 
CGGACAGG AT 
TTCTTAGTTG 
CTAACTAGTT 
GCGTTCAGCC 
TGCACGCGCG 
AACCCGTTGA 
GAATTCCCAG 
ACCGCCCGTC 
* GGTCGGCCCA 
^ AGTAAAAGTC 
CTGTGGAGGA 
CGCGTGCGTC 
GAAGGGGTGG 
TCCCCTCTCC 
GCGTCTTGCC. 
GGTTTTTGAC 
CCCATCCCCG 
GGATGTGAGT 
GTCCTCCCCG 
CCCCGGGGGG 
CGGTCGTTCG 
CCCGAGGCGG 
CCCGACCCGC 
GGGTTCCCGT 
CACGTGTCTC 
CCTCTCTCTC 
CGTGAGTTCG 
TGCGTCGATG 
CATCGACACT 
CGTCGGTTGA 
CTCGCAGGGC 
GGGCGGTTGT 
CGCGCTCGCG 
GCCTCGCGTC. 
TGGGAACCCA 
GAGGTTGGCG 
GGTTGTCGGG 
GTTTGGGTCT 
GGCGCCGCGC 
GTATCCCCGG 
CCTCGGTGGG 
CGTGGCTCTT 



AACTGTGGTA 
GGATGCGTGC. 
GGGTCGGGCG 
TGGCGGCQAC 
TACCATGGTG 
AACGGCTACC 
AGGTAGTGAC^ 
"CACTTTAAAT 
AATTCCAGCT 
TTGGGAGCGG 
CTCGGCGCCC 
AAAA?^TTAG 
GAATAGGACC 
GGCCGGGGGC 
CCAGAGCGAA 
GAAGACGATC 
GCGTTATTCC 
GGAGTATGGT 
GCCTGCGGCT 
TGACAGATTG 
GTGGAGCGAT 
ACGCGACCCC 
ACCCGAGATT 
CTACACTGAG 
ACCCCATTCG 
TAAGTGCGGG 
GCTACTACCG 
CGGCCCTGGC 
GTAACAAGGT 
GCGGCGGCGT 
CCGGGTCCCG 
GTGGGGTCGG 
CTCGTCCGGC 
TCTTTCCCGT 
CCGTCCCGGG 
CCGCGGCTCT 
GTCGCGTGTG 
CTCCTGTCCC 
GTCGCCCTGC 
GGCGGCTCTC 
CGGTCGTGTG 
GCCGCCGGCT, 
GTCGTTCCCG 
GTTTCGTTCC 
CGGGGAGAGG 
CTCACACCCG 
AAGAACGCAG 
TCGAACGCAC 
CGATCAATCG 
CAACCCCCCA 
CGGTGTGGGG 
GCTTCTTCCC 
GGCGCCTCCC- 
CCGCGCCCCC 
GTTGAGGGTG 
GTGGCGGTCG 
TGCGCTGGGG 
-ACCCTCCGGC 
TGGCGTTGCG 
CGCCTTCGCG 
CTTCGTCTCC 



ATTCTAGAGC 
ATTTATCAGA 
CCGGCGGCTT 
GACCCATTCG 
ACCACGGGTG 
ACATCCAAGG 
GAAAAATAAC- 
CCTTTAACGA 
CCAATAGCGT 
6CGGGCGGTC 
CCTCGATGCT 
AGTGTTCAAA 
GCGGTTCTAT 
ATTCGTATTG 
AGCATTTGCC 
AGATACCGTC 
CATGACCCGC 
TGCAAAGCTG 
TAATTTGACT 
ATAGCTCTTT 
TTGTCTGGTT 
CGAGCGGTCG 
GAGCAATAAC 
TGGCTCAGCG 
TGATGGGGAT 
TCATAAGCTT 
ATTGGATGGT 
GGAGCGCTGA 
TTCCGTAGGT 
GGCCCGCTCT 
TCGCCCGCGT 
TCTGGGTCCG 
TCTGACCTCG 
CCGGCTCTTC 
6GCGTTCGGT 
GGCTTTTCTA 
GGCTCGCCCG 
GGGTACCTAG 
CGCCCCCAGG 
CCTCAGACTC 
GGGGGGTGGA 
TGCCCGATTT 
TGTTTTTCCG 
TGCTGGCCGG 

AAATACCGAT 
CTAGCTGCGA 
TTGCGGCCCC 
CGTCACCCGC 
ACCCGGGTCG 
CGCGCGCCCG 
GCTCCGCCGT 
GGACCGCTGC 
GTGGCGCCCG 
TGCGTGCGCC 
ACGAGGGCCG 
_GAGG.CGGGGT« 
TTGTGTGGAG 
AGGGAGGGTT 
CCGCACGCGG 



TAATAGATGC 
TCAAAACCAA 
GGTGACTCTA 
AACGTCTGCC 
ACGGGGAATC 
AAGGCAGCAG 
AATACAGGAC- 
GGATCCATTG 
ATATTAAAGT 
CGCCGCGAGG 
CTTAGCTGAG 
GCAGGCCCGA 
TTTGTTGGTT 
CGCCGCTAGA 
AAGAATGTTT 
GTAGTTCCGA 
CGGGCAGCTT 
AAACTTAAAG 
CAACACGGGA 
CTCGATTCCG 
AATTCCGATA 
GCGTCCCCCA 
AGGTCTGTGA 
TGTGCCTACC 
CGGGGATTGC 
GCGTTGATTA 
TTAGTGAGGC 
GAAGACGGTC 
GAACCTGCGG 
CCCCGTCTTG 
GTGGAGCGAG 
TCTGGGACCG 
CCACCCTACC 
CGTGTCTACG 
CGTCGGGGCG 
CGTTGGCTGG 
TCCCGATGCC 
CTGTCGCGTT 
GTCGGGGGGC 
CATGACCCTC 
TGTCTGGAGC 
CCGCGGGTCG 
CTCCCGACCC 
CCTGAGGCTA 
TCGTTGGGGG 
ACGACTCTTA 
GAATTAATGT 
GGGTTCCTCC 
TGCGGTGGGT 
GGCCCTCCGT 
CGTCGCGGAG 
TCCCGCCCTC 
CTCACCAGTC 
GGGGTGGGCG 
GAGGTGGTGG 
GTCGGTCGCC 
-CGACCGCTCG 
GGAGAGCGAG 
TGGCGTCCCG 
CCGCTAGGGG 
CACCCGGGCG- 



CGACGGGCGC 
CCCGGTGAGC 
GATAACCTCG 
CTATCAACTT 
AGGGTTCGAT 
GCGCGCAAAT 
TCTTTCGAGG" 
GAGGGCAAGT 
TGCTGCAGTT 
CGAGTCACCG 
TGTCCCGCGG 
GCCGCCTGGA 
TTCGGAACTG 
GGTGAAATTC 
TCATTAATCA 
CCATAAACGA 
CCGGGAAACC 
GAATTGACGG 
AACCTCACCC 
TGGGTGGTGG. 
ACGAACGAGA 
ACTTCTTAGA 
TGCCCTTAGA 
CTGCGCCGGC 
AATTATTCCC. 
AGTCCCTGCC 
CCTCGGATCG 
GAACTTGACT 
AAGGATCATT 
TGTGTGTCCT 
GTGTCTGGAG 
CCTCCGATTT . 
GCGGCGGCGG 
AGGGGCGGTA 
CGCGCTTTGC 
GGCGGTTGTC 
ACGCTTTTCT 
CCGGCGCGGA 
GGTGGGGCCC 
CTCCCCCCGC 
CCCCTCGGGC 
GTCCTGTCGG 
TTTTTTTTTC 
CCCCTCGGTC 
ACTGTGCCGT 
GCGGTGGATC 
GAATTQCAGG 
CGGGGCTACG 
GCTGCGCGGC 
CTCCCGAAGT 
CCTGGTCTCC 
GCCCGTGCAC 
TTTCTCGGTC 
CGTCCGCATC 
TCGGTCCCCT 
TGCGGTGGTT 
GGGGGTTGGC 
GGCGAGAACG 
CGTCCGTCCG 
CGGTCGGGGC 
GTACCCGCTC 



TGACCCCCCT 
TCGCTCCCGG 
GGCCGATCGC 
TCGATGGTAG 
tCCCSGAGAGG 
^TACCCACTCC_ 
CCCTGTAATT 
CTGGTGCCAG 
AAAAAGCTCG 
CCCGTCCCGG 
GGCCCGAAGC 
TACCGCAGCT 
AGGCCATGAT . 
TTGGACCGGC 
AGAACGAAAG 
TGGCGACTGG 
AAAGTCTTTG 
AAGGGCACCA 
GGCCCGGACA 
TGCATGGCCG 
CTCTGGCATG 
GGGACAAGTG 
TGTCCGGGGC 
AGGCGCGGGT 
CATGAACGAG 
CTTTGTACAC 
GCCCCGCCGG 
ATCTAGAGGA 
AAACGGGAGA 
CGCCGGGAGG 
TGAGGTGAGA 
CCCCTCCCCC 
CTGCTGGCGG 
CGTCGTTACG 
TCTCCCGGCA , 
GCGTGTGGGG 
GGCCTCGCGT 
GGTTTAAGGA 
GTAGGGAAGT 
TGCCGCCGTT 
GCCGTGGGGG 
TGCCGGTCGT ' 
CTCCCCCCCA 
CATCTGTTCT 
CGTCAGCACC 
ACTCGGCTCG 
ACACATTGAT 
CCTGTCTGAG. 
TGGGAGTTTG 

tcagacgtgt: 
cccgcgcatc 

CCCGGTCCTG 
CCGTGCCCCG 
TGCTCTGGTC- 
GCGGCCGCGG 
GTCTGTGTGT 
GCGGTCGCCC 
GAGAGAGGTG 
TCCCTCCCTC 
CCGTGGCCCC 
CGGGGCGGGC - 



9840 

9900 

9960 

i0020i 

10080 

10140^ 

10200" 

10260 

10320 

10380 

/ 10440 

10500- 

10560 

10620 

10680 

10740 

10800 

10860 

10920 

10980 

11040 

11100 

11160 

11220 

11280 

11340 

11400 

11460 

11520 

11580 

11640 

11700 

11760 

11820 

11880 , 

11940 

12000 

12060 

12120 

12180 

12240 

12300 

12360 

- 12420 

12480 

12540 

12600 

12660 

12720 

12780 

12840 . 

12900 

12960 

13020 

13080 

13140^ 

13200 

13260 

,13320 

13380 ; 
13440- ~ 
CCGCGGGACG CCGCGGCGTC CGTGCGCCGA TGCGAGTCAC CCCCGGGTGT TGCGAGTTCG 13500

GGGAGGGAGA GGGCCTCGCT GACCCGTTGC GTCCCGGCTT CCCTGGGGGG GACCCGGCGT 13560

CTGTGGGCTG TGCGTCCCGG GGGTTGCGTG TGAGTAAGAT CCTCCACCCC CGCCGCGCTC 13620

CCCTCCCGCC GGCCTCTCGG GGACCCCCTG AGACGGTTCG CCGGCTCGTC CTCCCGTGCG 13680

GCCGGGTGCC GTCTCTTTCC CGCCCGCCTC CTCGCTCTCT TCTTCCCGCG GCTGGGCGCG 13740

TGTCeCCCCT TTCTGACCGC GACCTCAGAT CAGACGTGGC GACCCGCTGA ATTTAAGCAT 13800

ATTAGTCAGG GGAGGAAAAG AAACTAACCA GGATTCCCTC AGTAACGGCG AGTGAACAGG 13860

GAAGAGCCCA GCGCCGAATC CCCGCCGCGC GTCGCGGCGT GGGAAATGTG GCGTACGGAA 13920

GACCCACTCC CCGGCGCCGC TCGTGGGGGG CCCAAGTCCT TCTGATCGAG GGCCAGCCCG 13980

TGGACGGTGT GAGGCCGGTA GCGGCCCCGG CGCGCCGGGC TCGGGTCTTC CCGGAGTCGG 14040

GTTGCTTGGG AATGCAGCCC AAAGCGGGTG GTAAACTCCA TCTAAGGCTA AATACCGGCA 14100

CGAGACCGAT AGTCAACAAG TACCGTAAGG GAAAGTTGAA AAGAACTTTG AAGAGAGAGT 14160

TCAAGAGGGC GTGAAACCGT TAAGAGGTAA ACGGGTGGGG TCCGCGCAGT CCGCCCGGAG 14220

GATTCAACCC GGCGGCGCGC GTCCGGCCGT GCCCGGTGGT CCCGGCGGAT CTTTCCCGCT 14280

CCCCGTTCCT CCCGACCCCT CCACCCGCGC GTCGTTCCCC TCTTCCTCCC CGCGTCCGGC 14340

GCCTCCGGCG GGGGGGGCGG GGGGTGGTGT GGTGGTGGCG CGCGGGCGGG GCCGGGGGTG 14400

GGGTCGGCGG GGGACCGCCC CCGGCCGGCG ACCGGCCGCC GCCGGGCGCA CTTCCACCGT 14460

GGCGGTGCGC CGCGACCGGC TCCGGGACGG CCGGGAAGGC CCGGTGGGGA AGGTGGCTCG 14520

GGGGGGGCGG CGCGTCTCAG GGCGCGCCGA ACCACCTCAC CCCGAGTGTT ACAGCCCTCC 14580

GGCCGCGCTT TCGCCGAATC CCGGGGCCGA GGAAGCCAGA TACCCGTCGC CGCGCTCTCC 14640

CTCTCCCCCC GTCCGCCTCC CGGGCGGGCG TGGGGGTGGG GGCCGGGCCG CCCCTCCCAC 14700

GGCGCGACCG CTCTCCCACC CCCCTCCGTC GCCTCTCTCG GGGCCCGGTG GGGGGCGGGG 14760

CGGACTGTCC CCAGTGCGCC CCGGGCGTCG TCGCGCCGTC GGGTCCCGGG GGGACCGTCG 14820

GTCACGCGTC TCCCGACGAA GCCGAGCGCA CGGGGTCGGC GGCGATGTCG GCTACCCACC 14880

CGACCCGTCT TGAAACACGG ACCAAGGAGT CTAACGGGTG CGCGAGTCAG GGGCTCGTCC 14940

GAAAGCCGCC GTGGCGCAAT GAAGGTGAAG GGCCCCGCCC GGGGGCCCGA GGTGGGATCC 15000

CGAGGCCTCT CCAGTGCGCC GAGGGCGCAC CACGGGCCCG TCTCGCCCGC CGCGCCGGGG 15060

AGGTGGAGCA CGAGCGTACG CGTTAGGACC CGAAAGATGG TGAACTATGC TTGGGCAGGG 15120

CGAAGCCAGA GGAAACTCTG GTGGAGGTCC GTAGCGGTCC TGACGTGCAA ATCGGTCGTC 15180

CGACCTGGGT ATAGGGGCGA AAGACTAATC GAACCATCTA GTAGCTGGTT CCCTCCGAAG 15240

TTTGCCTCAG GATAGCTGGC GCTCTCGCTC CCGACGTACG CAGTTTTATC CGGTAAAGCG 15300

AATGATTAGA GGTCTTGGGG CCGAAACGAT CTCAACCTAT TCTCAAACTT TAAATGGGTA 15360

AGAAGCCCGG CTCGCTGGCG TGGAGCCGGG CGTGGAATGC GAGTGCCTAG TGGGCCACTT 15420

TTGGTAAGCA GAACTGGCGC TGCGGGATGA ACCGAACGCC GGGTTAAGGC GCCCGATGCC 15480

GACGCTCATC AGACCCCAGA AAAGGTGTTG GTTGATATAG ACAGCAGGAC GGTGGCCATG 15540

GAAGTCGGAA TCCGCTAAGG AGTGTGTAAC AACTCACCTG CCGAATCAAC TAGCCCTGAA 15600

AATGGATGGC GCTGGAGCGT CGGGCCCATA CCCGGCCGTC GCCGCAGTCG GAACGGAACG 15660

GGACGGGAGC GGCCGCGGGT GCGCGTCTCT CGGGGTCGGG GGTGCGTGGC GGGGGCGGGT 15720

CCCCCGCCTC CCCTCCGCGC GCCGGGTTCG CCCCCGCGGC GTCGGGCCCC GCGGAGCCTA 15780

GGCCGCGACG AGTAGGAGGG CCGCTGCGGT GAGCCTTGAA GCCTAGGGCG CGGGCCCGGG 15840

TGGAGCCGCC GCAGGTGCAG ATCTTGGTGG TAGTAGCAAA TATTCAAACG AGAACTTTGA 15900

AGGCCGAAST GGAGAAGGGT TCCATGTGAA CAGCAGTTGA ACATGGGTCA GTCGGTCCTG 15960

AGAGATGGGC GAGTGCCGTT CCGAAGGGAC GGGCGATGGC CTCCGTTGCC CTCGGCCGAT 16020

CGAAAGGGAG fCGGGTTCAG ATCCCCGAAT CCGGAGTGGC GGAGATGGGC GCCGCGAGGC 16080

CAGTGCGGTA ACGCGACCGA TCCCGGAGAA GCCGGCGGGA GGCCTCGGGG AGAGTTCTCT 16140

TTTCTTTGTG AAGGGCAGGG CGCCCTGGAA TGGGTTCGCC CCGAGAGAGG GGCCCGTGCC 16200

TTGGAAAGCG TCGCGGTTCC GGCGGCGTCC GGTGAGCTCT CGCTGGCCCT TGAAAATCCG 16260

GGGGAGAGGG TGTAAATCTC GCGCCGGGCC GTACCCATAT CCGCAGCAGG TCTCCAAGGT 16320

GAACAGCCTC TGGCATGTTG GAACAATGTA GGTAAGGGAA GTCGGCAAGC CGGATCCGTA 16380

ACTTCGGGAT AAGGATTGGC TCTAAGGGCT GGGTCGGTCG GGCTGGGGCG CGAAGCGGGG 16440

CTGGGCGCGC GCCGCGGCTG GACGAGGCGC CGCCGCGCTC TCCCACGTCC GGGGAGACCC 16500

CCCGTCCTTT CCGCCCGGGC CCGCCCTCCC CTCTTCCCCG CGGGGCCCCG TCGTCCCCCG 16560

CGTCGTCGCC ACCTCTCTTC CCCCCTCCTT CTTCCCGTGG GGGGGCGGGT CGGGGGTCGG 16620

CGCGCGGCGC GGGCTCCGGG GCGGCGGGTC CAACCCCGGG GGGGTTCCGG AGCGGGAGGA 16680

ACCAGCGGTC CCCGGTGGGG CGGGGGGCCC GGACACTCGG GGGGCCGGCG GCGGCGGCGA 16740

CTCTGGACGC GAGCCGGGCC CTTCCCGTGG ATCGCCTGAG CTGCGGCGGG CGTCGCGGCC 16800

GCTCCCGGGG AGCCCGGCGG GTGCCGGCGC GGGTCCCCTC CCGGCGGGGC CTCGCTCCAC 16860

CCCCCCATCG CCTCTCCCGA GGTGCGTGGC GGGGGCGGGC GGGCGTGTCC CGCGCGTGTG 16920

GGGGGAACCT CCGCGTCGGT GTTCCCCCGC CGGGTCCGCC CCCCGGGCCG CGGTTTTCCG 16980

CGCGGCGCCC CCGCCTCGGC CGGCGCCTAG CAGCCGACTT AGAACTGGTG CGGACCAGGG 17040

GAATCCGACT GTTTAATTAA AACAAAGCAT CGCGAAGGCC CGCGGCGGGT GTTGACGCGA 17100
^S^ ^C^^^ X71S0

CGCGCATGAA TGGATGAACG aSIJJ^JJJ? JSJcSJirr ^^^CGTCATC TAATTAGTGA £722? 

GCCAAGGGAA CGGGCTTGGC GGAATCAGCG Ar^af^J^^S TACTATCCAG CGAAACCACA 17280 

AGTCTGGCAC GGTGAAG^A StcISSS? CTA^^t^^ ACCGTGTTGA GCTTGACTCT llllo 

GCCCCGTCCT CGCGTCGGgS JSS^^aS J^G^^^JJJn 1^^^^"==^ ^^^^^^^C^^G ^74?? 

CTACTCTCAT CGTTTTTTCA CTGACCCGGT rArr^Sl^SS ^GGCGGCCGG TGAAATACCA 17460 

- - cgcttctggc-gccaagcVtc CG?c??S?G^^ itIIo" 

GGGGAGTTTG SSgS^^gS JlSJSS^r fff^^S^P CTCCGGGGAC 1718? 
TAAGGCGAGC TCAGGGAGGA CAGAAACOTC CCr^nnlnn^ f^^^^^^ GCAGGTGTCC 17640 
ATCTTGATTT TCAGTACGAA ScAgK^S? SS^i^ 2^°*^*=^ AGCTCGCTTG 17700 
TTTGGGTTTT AAGCAGGAGG TGTcSSS?! OT^J^S^r nilS^'^'^^^ CTTCTGACCT: 17760 
CCAAGCGTTC ATAGCGACGT CGCTTTTTGA Tm^A^? SS^^^^®^ CTTGTGGCGG 17820 
GAAGCAGAAT TCACCAAGCG TTgStTgS SSScT^f I^^^S^^^^ CTATCATTGT 17880 
TAGACCGTCG TGAGACAGGT TAGTTTTACC SSSa^^ ^^SSS^^'^T GAGCTGGGTT 17940 
CCTGCTCAGT ACGAGAGGAA CCGCAgItTC A^t^I^^ TGTGTTGTTG CCATGGTAAT 18000 
GCCAATGGGG CGAAGCTaSJ S?Sg??SSA ^JaSSSS IrrrnS^S"^ TGGCTGAGGA 18065 
GCCCAAGCGG AACGATACGG CAGCGCCgS GGA^S^r^ ^oSH^^^^ GTCAGAATCC 18120 
CCCCGTCCGT CCCGCTCGGC GGGGTCCCCG ^T^r^n^ J^^^'^^^^^ ATAGCCGGGT 18180 
CCGCCGGGCG TCGGGACCGG GGT??GrT^o CGGCGGCGCG GGGTCTCCCC 18240 

CGGCCGGAAA GGGgSJg?? S^S^S^^^S JScr??^^! TCGTCTTGGG AAACGGGGTG lllto 
GGCGCTAAAC CATTCGTAGA SaJSSS?? ^crr^nf^ CGCACGTTCG TGTGGAACCT 18360 
GCTCCCTCGC TGCGATCtS ?????SaS aI^J^S^^^*" TAGCAGAGCA 18420 

TTTCCCGTCG CACGCCCGCT CGCTCGCACG csf r^^^^o ^^^^^^"^ CTCTGCGGGC 18480 
GCGGTCGCCT CGGCCCCCGC G^Sttc??? If^nnn™ S^CGCCCGGG CGTCACGGGG 18540 
CGTCTTCTCC TCCGTCTCCC SggISSS GTGGTGGTTG GGGGGGGGAT 18600 

TGGGTGTGGG AGCCTCGTGC CGTCGCGaS Gr^^^^ TCCCCTTCGG TCGCTCTCCT 18660 • 
. CTTGCCCTeC GGCCTTGGCC S^^fS^f^S GTCGCCTGCC GCCGCAGCCC 18720 

CGGCGGCGCG GTGACGCAcS otSSSJJ?? ^^^^^^ GCGGCGGCGA 1878? 

GTTGGAGGGG GGGGAGGGGT TTTTCCCGTr aar-^^^SSS GCGTCCGTCG GGGACGGCCG 18840 
GCCGGGGGGG CGCTCTC??? SSS^I? SJcSSSr rn^^SS"^" GCCTCTGGCG IslJS 
GCGGCGGCGA CGTGCGTACG AGGcra^r^J ^SSz^S^*^^^ GCCCCTCCTC TTCGCGCGCC 18 960 
GCGGCGCCTC TTCCATTTTT TC?^?^aI GGAGGCGGAG AGGGTCCGGC 19020 

ACTTTGTTTT TTT^TTTCC CCCgS^^ '^n^'"'^ CGACCAGTAC TCCGGGCGAC llSsS 
CCCCCCCCCC CCCCcSg?S §§SS?^Jg S?rrf^^S AGATGTCCGA AAGTGTCCCC Iii40 
TT l lir iTO TTAAATTCCT GGAaJ^JJJS Sr^^»J GGACTCTTTT TTTTTTTTTT 19200 
^TATAGGTC GACCA6TACT cSSSJSJa "^^ACTCCTT 19260 

GACCAGATAT CCGAAAGTCC TCTCTTTCC? TTtZ^^^^^ 11^^®^^^^ CCCAGAGGTC 19320 
TTTTTTTTTT TTTGGTGTGC CTcStSSa ^a^SIJ? CCCACAGCGA TTCTCTTTTT 19380 
ATATACTTAT AGGAGGAGGT ^gSSSJSJ T^S^rr^ 1^1^'^'^'^ GTGTACGTTT 19440 
TCCACCGATG ATGGAGGTCG AcSStgtc ^rf^S?^^ ACTTTGTTTT TTTTTTTTTT 19500 
CCGCGACGCG GCGGGCTCAC TCTOrA^^nS SS^^^SIE^ CCGTCCCCCC CCTCCCCCCC 19560 
TGGAACCTTA AG2??Sgf gSStcSJJ JJJSS^J TTTTTl-rTTT TTTAAATTTC lltto 
TACTTTGTCT TTTTCTGAAA ATCrraoaA^ ^^^^^ TCATATAGGT CGACCGGTGG 19680 
ATAAATTATC TGAtSS^ JJS???^ g5S??JSt ^S^S^^^ CTGGTGGTCG IVyll 
TTTGTGTTGT TTTGTTTTGT TTTGTTTTG? T^^^^^^^^ ^^°Z5ZI^^ 19800 
TTTGTGTTGT GTTGTGTTCT CTTGtSS^ jnS.!^^^ TTTGTTTTGT TTTGTTTTGT 19860 
GTTGGGTTGG 'gSS??S? SS????^ ^^^^"^ GTTGGGTtSI "92 S 

p-GTTTGCTG TTGTTTTGTG TTTTGCG^T ^^rfJ^n IT^^^^TGTTGT TGGTTTTGTT 19980 
TACACAAACA TGCACTTTTT tSSaJSI ™?^fIIS AGTTTTTTTG 20040 

TATCCCTTTC CTTCTCTCTC TTTTTTAaaa T^^^^^ AAATGCGAAA ATCGACCAAT 20100 
TGTGTGTGTG TGCGTctSS C^f^S J^^GTGTGTG TGTGTGTGTG 20^6^ 

TACTTATAAT AATAGGTCGC CGgStS??! GCGCGCGCTC GTTTTATAAA 20220 

GCAGACTTCT GAGTTCGaIg CcSJ™? OTf^n^^^^'^^^^'^ GCAGAGGCAG 2ol8 0 
AATAAATACA TACATACaS CAtSJS Srf^?™ ACCCTGTCTC GAAAAATGAA 2034 0 
GTTGACCAGT TGTCAATCCT SagS??S r^^^^l^ CATACATACA TACATATGAG 20400 
AATAGATAGA TGGATAGAGT StaJJSS AATGTGATAG AGAGATAGAT 20460 

ATTAACCACT TTtCCCI^ ^Goi^^'^^^^- JIIS^'?^ 20525 " 

S?S?5?S? T^???G^ ciiiii ™- 2?I?2S?S 2%°l4^S 

_^--™_"TCTTCT^ SSSISS S^^^ 
TCTGTAGACC AGCCTGGCCT CAATCGAACT CAGAAATCCT CCTGCCTGTT GTCTACCTCC 20820

CAATTTTGGA GTAAAGGTGT GCTACACCAC TGCCTGGCAT TATTATCATT ATCATTATTA 20880

ATTTTATTAT TAGACAGAAC GAAATCAACT AGTTGGTCCT GTTTCGTTAA TTCATTTGAA 20940

ATTAGTTGGA CCAATTAGTT GGCTGGTTTG GGAGGTTTCT TTTGTTTCCG ATTTGGGTGT 21000

TTGTGGGGCT GGGGATCAGG TATCTCAACG GAATGCATGA AGGTTAAGGT GAGATGGCTC 21060

GATTTTTGTA AAGATTACTT TTCTTAGTCT GAGGAAAAAA TAAAATAATA TTGGGCTACG 21120

TTTCATTGCT TCATTTCTAT TTCTCTTTCT TTCTTTCTTT CTTTCAGATA AGGAGGTCGG 21180

CCAGTTCCTC CTGCCTTCTG GAAGATGTAG GCATTGCATT GGGAAAAGCA TTGTTTGAGA 21240

GATGTGCTAG TGAACCAGAG AGTTTGGATG TCAAGCCGTA TAATGTTTAT TACAATATAG 21300

AAAAGTTCTA ACAAAGTGAT CTTTAACTTT TTTTTTTTTT TTTCTCCTTC TACTTCTACT 21360

TGTTCTCACT CTGCCACCAA CGCGCTTTGT ACATTGAATG TGAGCTTTGT TTTGCTTAAC 21420

AGACATATAT TTTTTCTTTT GGTTTTGCTT GAGATGGTTT CCCTTTCTAT CCGTGCAGGG 21480

TTCCCAGACG GCCTTTTGAG AATAAAATGG GAGGCCAGAA CCAAAGTCTT TfGAATAAAG 21540

CACCACAACT CTAACCTGTT TGGCTGTTTT CCTTCCCAAG GCACAGATCT TTCCCAGCAT 21600

GGAAAAGCAT GTAGCAGTTG TAGGACACAC TAGACGAGAG CACCAGATCT CATTGTGGGT 21660

GGTTGTGAAC CACCCACCAT GTGGTTGCCT GGGATTTGAA CTCAGGATCT TCAGAAGACG 21720

AGTCAGGGCT CTAAACCGAT GAGCCATCTC TCCAGCCCTC CTACATTCCT TCTTAAGGCA 21780

TGAATGATCC CAGGATGGGA AGACAGTCTG CCCTCTTTGT GGTATATCAC CATATACrCA 21840

ATAAAATAAT GAAATGAATG AAGTCTCCAC GTATTTATTT CTTCGAGCTA TCTAAATTCT 21900

CTCACAGCAC CTCCCCCTCC CCCACACTGC CTTTCTCCCT ATGTTTGGGT GGGGCTGGGG 21960

GAGGGGTGGG GTGGGGGCAG GGATCTGCAT GTCTTCTTGC AGGTCTGTGA AGTATTTGCG 22020

ATGGCGTGGT TCTCTGAACT GTTGAGCCTT GTCTATCGAG AGGCTGACTG GCTAGTTTTG 22080

TACCTGAAGT CCCTGAGTGA TGATTTCCCT GTGAATTC 22118



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42999 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:



GCTGACACGC 
GGGCTCTGGC 
CGGCCCGCGG 
GGGTCCGGGT 
CGTGCGCTCT 
CGTGCGTGTC 
GGGTGGCCGG 
ATCGATGTGG 
GACGTTCGTG 
CCTCTCCCCG 
TCGCCGTCCC 
GCAGACAGCC 
GCGGTGGGGG 
GCGTCGGCTC 
GCCCGAGGCC 
CCGCGGTGTC 
AGTGAGACGA 
GAGCGCACGT 
GGGTTCGGGC 
CGACCGGTCG 
GGACGGGGGG 



TGTCCTCTGG 
CTCACGGTC3A 
GCCTGCTGTT 
CTCTGACCCA 
CCGCTGCGGG 
AGGCGTTCTC 
AGCCGATCGG 
TGACGTCGTG 
GCGAAGGGGA 
CCCGCCGGCC 
GCCCGCCGCC 
CTGCCTGTCG 
TGCCGTCCCG 
CGCCTGGGCC 
GAACGGTGGT 
GGCGCGTGGG 
GACGAGACGC 
CCCGTGCTCC 
CGGTGTGACG 
TGTGTGGGTT 
GCCTGGTGGG 



CGACCTGTCG 
CCGGCTAGCC 
CTCTCGCGCG 
CCCGGGGGCG 
CGCCCGGGGC 
GTCTCCGCGG 
CTCGCTGGCC 
CTCTCCCGGG 
CCGTCCTTCT 
GGCGTGTGGG 
TTCGCTTCGC 
CCTCCAGTGG 
CCGGCCCGTC 
CTTGCGGTGC 
GTGTCGTTCC 
TCCTGAGGGA 
GCCCCTCCCA 
CCTCTGGCGG 
CGTGCGCCGG 
GACTTCGGAG 
GTTGCGCGCA 



TCGGAGAGGT 
GGCCGCGCTC 
TCCGAGCGTC 
GCGGGGAAGG 
GCCGCACAAC 
GGTTGTCCGC 
GGCCGGCCTC 
CCGGGTCCGA 
CGCTCCGCCC 
AAGGCGTGGG
GGGTGCGGGC 
TTGTCGACTT 
GTGCTGCCCT 
TCCTGGAGCG 
CGCCCCCGGC 
GCTCGTCGGT 
CGCGGGGAAG 
GTGCGCGCGG 
CCGGCCGGCG 
GCGCTCTGCC 
CGCGCGCACC 



TGGGCCTCCG 
CTGCCTTGAG 
CCGACTCCCG 
CGGCGAGGGC 
CCCACCCGCT 
CGCCCCTTCC 
CGCTCCCGGG 
GGCGCGACGG 
GCGCGGTCCC 
GTGCGGACCC 
CGGCGGGGTC 
GGGGGCGGCC 
CTCGGGGGGG 
CTCCGGGTTG 
GCCCCCTCCT 
GTGGGGTTCG 
GGCGCCCGCC 
GCCGTGTGAG 
AGGGGCTGCC 
TCGGAAGGAA 
GGCCGGGCCC 



GATGCGCGCG 
CCGCCTGCCG 
GTGCCGGCCC 
CACCGTGCCC 
GGCTCCGTGC 
CCGGAGTGGG 
GGGCTCTTCG 
GCGAGGGGCG 
CTCGTCTGCT 
CGGCCCGACC 
CTCTGACGCG 
CCCCTCCGCG 
GTTTGCGCGA 
TCCCTCAGGT 
CCGGTCGCCG 
AGGCGGTTTG 
TGCTCTCGGT 
CGATCGCGGT 
GTTCTGCCTC 
GGAGGTGGGT 
CCGCCCTGAA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
CGCCSAACGCT
CCCCAGGCGT
GTGTGACCCA
GTACCGGATC 
GCGACCCGCT 
CGAGGGTTCC 
GCGCCTCCCC
AGCGGGTTGG 
CGCGTGACCC 
GGTGCCGACG 
CCCACGGTGG 
CACGAGCGAC 
GTGGCGGGGC 
CCCAGGCGGG 
TTTTCCTGGT 
CCGGGTGCCC 
GCCGCCCGCC 
CCCCGCCTGG 
CCCGGCGTCC 
GGCCCGGTGG 
ACGGTTCCGG 
TCGCCGAGGG 
CCGCGCGTGT 
CCCGCCGGCC
CTTCTCCCGG 
CGGGCCTGCC 
CGTCCTCCCC 
GGTCCCTCCC
TCGCTCGCCC 
CTGCCTCCCG 
CGTCGTGTGG 
TCGGCCGGGC 
GACTGTCCCC 
GCCCCGCGGG 
GCGCGCGCGC 
CTCCTCGCGG 
GTCGTGTCGC 
CCACCGGTCC 
CCCGTCCGTC 
AGGCCCCCCG 
TGGTTGATCC 
CGCACGGCCG 
CGCTCGCTCC 
GCGCTGACCC 
GCCCCTCTCC 
TCGGGCCGAT 
CTTTCGATGG 
GATTCCGGAG 
AATTACGCAG 
AGGCCCTGTA 
AGTCTGGTGC 
GTTAAAAAGC 
CCGCCCGTCC 
CGGGGCCCGA 
GGATACCGCA 
CTGAGGCCAT 
TTCTTGGACC 
TCAAGAACGA 
CGATGCCGAC 
ACCAAAGTCT 

CGGAAGGGCA



CGAGGTGGCC GCGCGCAGGT 
CCCTCGGCGC CTCTGCGGGC 
CCCTCGGTGA GAAAAGCCTT 
CCCCGGGCCG CCGCCTCTGT 

CGCAGAGGAC CCTCCTCCGC
GCCGGCCACC GCGGTGGTGG

CTTCCGAGTC GGGGGAGGAT
GACGCGGCGG CCGGCGGGCG 
CCTCCGTCCG CGAGTCGGCT 
ACCGCGTTTG CGTGGCACGG 
GGGCGCGCCG GTCTCCCGGA 
GGTGGTGGTG GCGTGTCGGG 
CCCGGGGCTC GCGAGGCGGT 
GCGCCGCGGG ACCGCCCTCG 
GGCCCGGCCG TGCCTGAGGT 
TTGCCCTCGC GGTCCCCGGC 
GATCCTCTTC TTCCCCCCGA 
GACCGAACCC GGCACCGCCT 
GCGTCCCCCG GCGCGCGCCT
GCTTCCCGGA 6GGTTCCGGG 
GGGACCGGCC GCGGCTGCGG 
CCGGTCGGCC GCCCCGGGTG 
GTCCCGGCTG CGGTCGGCCG 
GCCTTTCTCG CGCCTTCCCC 
CCCGCTCTTC CGAACCGGGT 
GCGGCCCTTC CCCGAGGCGT 
GCGTGGCGTC GCCCCGTTCG 
GGACAGGCGT TCGTGCGACG 
TCTCCCCGGG TCGGGGGGTG 
TCCCGGGCGG GGGCGGGCGC 
CGTGTGCCAC CCCTGCGCCG 
CCCGGGCCCT CGACCGGACC
GGGCCGGGCA CCGCGGTCCG 
GCGGGCGGAG CGCCGTCCCC 
GCGTGGCCGC CGGTCCCTCC 
GCGGGCGCGA CGAAGAAGCG 
GTGGGGGGCG GGTGGTTGGG 
CGGCCGCCGC CCCCGCGCCC 
CGTCCGTCCG TCGTCCTCCT 
GCCGGCCGTC CGGCCGCGTC 
TGCCAGTAGC ATATGCTTGT 
GTACAGTGAA ACTGCGAATG 
TCTCCTACTT GGATAACTGT
CCTTCGCGGG GGGGATGCGT 
GGCCCCGGCC GGGGGGCGGG 
CGCACGCCCC CCGTGGCGGC 
TAGTCGCCGT GCCTACCATG 
AGGGAGCCTG AGAAACGGCT 
TCCCGACCCG GGGAGGTAGT 
ATTGGAATGA GTCCACTTTA 
CAGCAGCCGC GGTAATTCCA 
TCGTAGTTGG ATCTTGGGAG 
CCGCCCCTTG CCTCTCGGCG 
AGCGTTTACT TTGAAAAAAT 
GCTAGGAATA ATGGAATAGG 
GATTAAGAGG GACGGCCGGG 
GGCGCAAGAC GGACCAGAGC 
AAGTCGGAGG TTGGAAGACG 
CGGCGATGCG GCGGCGTTAT 
TTGGGTTCCG GGGGGAGTAT 
CCACCAGGAG TGGAGCCTGC 



GTTTCCTCGT 
CCGAGGAGGA 
CTCTAGCGAT 
CTCTGCCTCC 
TTCCCCCTCG 
CCGAGTGCGG
CCCGCCGGGC
GTGGGTGTGC 
CTCCGCCCGC 
GGTCGGGCCC 
GCGGGACCGG 
TTCGTGGCTG 
TCTCGGTGGG 
TGTGTGTGGC 
TTCTCCCCGA 
CCTCGCCCGT 
GCGGCTCACC 
CGTGGGGCGC 
TGGGGACCGG 
GGTCGGCCTG 
CGGCGGCGGT 
CCCCGCGGTG 
CGCTCGAGGG 
GTCGCCCCGG 
CGGCGCGTCC 
CCGTCCCGGG 
GCGCGCGCGT 
TGTGGCGTGG 
GGGCCCGGGC 
GCCGGCCGGC 
GCGCCCGCCG 
GGCTGCGCGG 
CCTCTCGCTC 
GCCTCGCCGC 
CGGCCGCCGG 
TCGCGGGTCT 
GCGTCCGGTT 
GCTCGCTCCC 
CGCTTGCGGG 
GGGGGCTCGC 
CTCAAAGATT 
GCTCATTAAA 
GGTAATTCTA 
GCATTTATCA 
CGCCGGCGGC 
GACGACCCAT 
GTGACCACGG 
ACCACATCCA 
GACGAAAAAT 
AATCCTTTAA 
GCTCCAATAG 
CGGGCGGGCG 
CCCCCTCGAT 
TAGAGTGTTC 
ACCGCGGTTC 
GGCATTCGTA 
GAAAGCATTT 
ATCAGATACC 
TCCCATGACC 
GGTTGCAAAG 
GGCTTAATTT 



ACCGCAGGGC 
GCGGCTGGCG 
CTGAGAGGCG 
GTTATGGTAG 
ACGGGGTTGG 
CTCGTCGCCT
CGGGCCCGGC 
GCGCCCGGCG 
TCCCGTGCCG 
GCCTGGCCCT 
GTCGGAGGAT 
CGGTCGCTCC 
GGCCGAGGGC 
GGTGGGATCC 
GCCGCCGCCT 
GTGTGCCCTC 
GGCTTCACGT 
CGCCGCCGGC 
GTCGGTGGCG 
CGGCGCGTGC 
GGTGGGGGGA 
CCGCCGGCGG 
GTCCCCGTGG 
CCTCGCCCGT 
CCCGGGTGCG 
CGTCGGCGTC 
GCGCCCGAGC 
GTCGACCTCC 
CGGGGCCTCG 
CTCGGTCGCC 
GCGGGGCTCG 
GCGCTGCGGC 
GCCGCCCGGA 
CGCCCGCGGG 
GCGCGGGTCG 
GTGGCGCGGG 
CGCCGCGCCC 
TCCCGTCCGC 
GCGCCGGGCC 
CGCGCTCTAC 
AAGCCATGCA 
TCAGTTATGG 
GAGCTAATAC 
GATCAAAACC 
TTTGGTGACT 
TCGAACGTCT 
GTGACGGGGA 
AGGAAGGCAG 
AACAATACAG 
CGAGGATCCA 
CGTATATTAA 
GTCCGCCGCG 
GCTCTTAGCT 
AAAGCAGGCC 
TATTTTGTTG 
TTGCGCCGCT 
GCCAAGAATG 
GTCGTAGTTC 
CGCCGGGGAG 
CTGAAACTTA
GACTCAACAC 



CCCCTCCCTT 
GGTGGGGGGA 
TGCCTTGGGG 
CGGTGCCGTA 
GGGGGAGAAG 
ACTGTGGCGC
GCTCCCACCC 
CTCTGTCCGG 
AGTCGTGACC 
GGGAAAGCGT 
GGACGAGAAT 
GGGGCCCCCG 
CGTCCGGCGT 
CGCGGCCGTG 
CTGCGGGCTC 
TTCCCCGCCC 
CCGTTGGTGG 
CACTGATCGG 
CGCCGCGTGG
GGGGGAGGAG 
GCCGCGGGGA 
CGGTGAGGCC 
CGTCCCCTTC 
GGTCTCTCGT 
CCTCGCTTCC 
GGGGAGAGCC 
GCGGCCCGGT 
GCCTTGCCGG
GCCCCGGTCG 
CTCCCTTGGC 
GAGCCGGGCT 
CGCACGGCGC
CGTCGGGGCC 
CGCCGGCGGC 
GGCCGTCCGC 
GCCCCCGGTG 
GGCCCCGGCC 
CCGTCCGCGG
CGTCCTCGCG 
CTTACCTACC 
TGTCTAAGTA 
TTCCTTTGGT 
ATGCCGACGG
AACCCGGTCA 
CTAGATAACC
GCCCTATCAA 
ATCAGGGTTC 
CAGGCGCGCA 
GACTCTTTCG 
TTGGAGGGCA 
AGTTGCTGCA 
AGGCGAGCCA
GAGTGTCCCG
CGAGCCGCCT
GTTTTCGGAA
AGAGGTGAAA
TTTTCATTAA
CGACCATAAA
CTTCCGGGAA
AAGGAATTGA
GGGAAACCTC 



1320
1380
1440
1500
1560
1620
1680
1740
1800
1860
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580
2640
2700
2760
2820
2880
2940
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 

4200
4260
4320
4380
4440
4500
4560
4620
4680
4740
4800
4860
4920
ACCCGGCCCG GACACGGACA GGATTGACAG ATTGATAGCT CTTTCTCGAT TCCGTGGGTG 4980

GTGGTGCATG GCCGTTCTTA GTTGGTGGAG CGATTTGTCT GGTTAATTCC GATAACGAAC 5040

GAGACTCTGG CATGCTAACT AGTTACGCGA CCCCCGAGCG GTCGGCGTCC CCCAACTTCT 5100 

TAGAGGGACA AGTGGCGTTC AGCCACCCGA GATTGAGCAA TAACAGGTCT GTGATGCCCT 5160

TAGATGTCCG GGGCTGCACG CGCGGTACAC TGACTGGCTC AGCGTGTGCC TACCCTACGC 5220 

CGGCAGGCGC GGGTAACGCG TTGAACCCCA TTCGTGATGG GGATCGGGGA TTGCAATTAT 5280 

TCCCCATGAA CGAGGGAATT CCCGAGTAAG TGCGGGTCAT AAGCTTGCGT TGATTAAGTC 5340

CCTGCCCTTT GTACACACCG CCCGTCGCTA CTACCGATTG GATGGTTTAG TGAGGCCCTC 5400 

GGATCGGCCC CGCCGGGGTC GGCCCACGGC CCTGGCGGAG CGCTGAGAAG ACGGTCGAAC 5460 

TTGACTATCT AGAGGAAGTA AAAGTCGTAA CAAGGTTTCC GTAGGTGAAC CTGCGGAAGG 5520

ATCATTAACG GAGCCCGGAG GGCGAGGCCC GCGGCGGCGC CGCCGCCGCC GCGCGCTTCC 5580 

CTCCGCACAC CCACCCCCCC ACGGCGACGC GGCGCGTGCG CGGGCGGGGC CCGCGTGCCC 5640

GTTCGTTCGC TCGCTCGTTC GTTCGCCGCC CGGCCCCGCC GCCGCGAGAG CCGAGAACTC 5700

GGGAGGGAGA CGGGGGGGAG AGAGAGAGAG AGAGAGAGAG AGAGAGAGAG AGAGAGAGAA 5760 

AGAAGGGCGT GTCGTTGGTG TGCGCGTGTC GTGGGGCCGG CGGGCGGCGG GGAGCGGTCC 5820 

CCGGCCGCGG CCCCGACGAC GTGGGTGTCG GCGGGCGCGG GGGCGGTTCT CGGCGGCGTC 5880 

GCGGCGGGTC TGGGGGGGTC TCGGTGCCCT CCTCCCCGCC GGGGCCCGTC GTCCGGCCCC 5940

GCCGCGCCGG CTCCCCGTCT TCGGGGCCGG CCGGATTCCC GTCGCCTCCG CCGCGCCGCT 6000

CCGCGCCGCC GGGCACGGCC CCGCTCGCTC TCCCCGGCCT TCCCGCTAGG GCGTCTCGAG 6060

GGTCGGGGGC CGGACGCCGG TCCCCTCCCC CGCCTCCTCG TCCGCCCCCC CGCCGTCCAG 6120 

GTACCTAGCG CGTTCCGGCG CGGAGGTTTA AAGACCCCTT GGGGGGATCG CCCGTCCGCC 6180

CGTGGGTCGG GGGCGGTGGT GGGCCCGCGG GGGAGTCCCG TCGGGAGGGG CCCGGCCCCT 6240

CCCGCGCCTC CACCGCGGAC TCCGCTCCCC GGCCGGGGCC GCGCCGCCGC CGCCGCCGCG 6300

GCGGCCGTCG GGTGGGGGCT TTACCCGGCG GCCGTCGCGC GCCTGCCGCG CGTGTGGCGT 6360 

GCGCCCCGCG CCGTGGGGGC GGGAACCCCC GGGCGCCTGT GGGGTGGTGT CCGCGCTCGC 6420 

CCCCGCGTGG GCGGCGCGCG CCTCCCCGTG GTGTGAAACC TTCCGACCCC TCTCCGGAGT 6480 

CCGGTCCCGT TTGCTGTCTC GTCTGGCCGG CCTGAGGCAA CCCCCTCTCC TCTTGGGCGG 6540

GGGGGGCGGG GGGACGTGCC GCGCCAGGAA GGGCCTCCTC CCGGTGCGTC GTCGGGAGCG 6600

CCCTCGCCAA ATCGACCTCG TACGACTCTT AGCGGTGGAT CACTCGGCTC GTGCGTCGAT 6660 

GAAGAACGCA GCTAGCTGCG AGAATTAATG TGAATTGCAG GACACATTGA TCATCGACAC 6720 

TTCGAACGCA CTTGCGGCCC CGGGTTCCTC CCGGGGCTAC GCCTGTCTGA GCGTCGCTTG 6780 

CCGATCAATC GCCCCGGGGG TGCCTCCGGG CTCCTCGGGG TGCGCGGCTG GGGGTTCCCT 6840

CGCAGGGCCC GCCGGGGGCC CTCCGTCCCC CTAAGCGCAG ACCCGGCGGC GTCCGCCCTC 6900

CTCTTGCCGC CGCGCCCGCC CCTTCCCCCT CCCCCCGCGG GCCCTGCGTG GTCACGCGTC 6960

GGGTGGCGGG GGGGAGAGGG GGGCGCGCCC GGCTGAGAGA GACGGGGAGG GCGGCGGCGC 7020 

CGCCGGAAGA CGGAGAGGGA AAGAGAGAGC CGGCTCGGGC CGAGTTCCCG TGGCCGCCGC 7080

CTGCGGTCCG GGTTCCTCCC TCGGGGGGCT CCCTCGCGCC GCGCGCGGCT CGGGGTTCGG 7140

GGTTCGTCGG CCCCGGCCGG GTGGAAGGTC CCGTGCCCGT CGTCGTCGTC GTCGCGCGTC 7200

GTCGGCGGTG GGGGCGTGTT GCGTGCGGTG TGGTGGTGGG GGAGGAGGAA GGCGGGTCCG 7260

GAAGGGGAAG GGTGCCGGCG GGGAGAGAGG GTCGGGGGAG CGCGTCCCGG TCGCCGCGGT 7320 

TCCGCCGCCC GCCCCCGGTG GCGGCCCGGC GTCCGGCCGA CCGGCCGCTC CCCGCGCCCC 7380 

TCCTCCTCCC CGCCGCCCCT CCTCCGAGGC CCCGCCCGTC CTCCTCGCCC TCCCCGCGCG 7440

TACGCGCGCG CGCCCGCCCG CCCGGCTCGC CTCGCGGCGC GTCGGCCGGG GCCGGGAGCC 7500

CGCCCCGCCG CCCGCCCGTG GCGGCGGCGC CGGGGTTCGG GTGTCCCCGG CGGCGACCCG 7560 

CGGGACGCCG CGGTGTCGTC CGCCGTCGCG CGCCCGCCTC CGGCTCGCGG CCGGGCCGCG 7620 

CCGCGCCGGG GCCCCGTCCC GAGCTTCCGC GTCGGGGCGG CGCGGCTCCG CCGCCGCGTC 7680

CTCGGACCCG TCCCCCCGAC CTCCGCGGGG GAGACGCGCC GGGGCGTGCG GCGCCCGTCC 7740 

CGCCCCCGGC CCGTGCCCGT CCCTCCGGTC GTCCCGCTCC GGCGGGGCGG CGCGGGGGCG 7800

CCGTCGGCCG CGCGCTCTCT CTCCCGTCGC CTCTCCCCCT CGCCGGGCCC GTCTCCCGAC 7860 

GGAGCGTCGG GCGGGCGGTC GGGCCGGCGC GATTCCGTCC GTCCGTCCGC CGAGCGGCCC 7920

GTCCCGCTCC GAGACGCGAC CTCAGATCAG ACGTGGCGAC CCGCTGAATT TAAGCATATT 7980 

AGTCAGCGGA GGAAAAGAAA CTAACCAGGA TTCCCTCAGT AACGGCGAGT GAACAGGGAA 8040

GAGCCCAGCG CCGAATCCCC GCCCCGGGGG GCGCGGGACA TGTGGCGTAC GGAAGACCCG 8100 

CTCCCCGGCG CCGCTCGTGG GGGGCCCAAG TCCTTCTGAT CGAGGCCCAG CCCGTGGACG 8160 

GTGTGAGGCC GGTAGCGGCC GGCGCGCGCC CGGGTCTTCC CGGAGTCGGG TTGCTTGGGA 8220 

ATGCAGCCCA AAGCGGGTGG TAAACTCCAT CTAAGGCTAA ATAGCGGCAC GAGACCGATA 8280 

GTCAACAAGT ACCGTAAGGG AAAGTTGAAA AGAACTTTGA AGAGAGAGTT CAAGAGGGCG 8340 

TGAAACCGTT AAGAGGTAAA CGGGTGGGGT CCGCGCAGTC CGCCCGGAGG ATTCAACCCG 8400

GCGGCGGGTC CGGCCGTGTC GGCGGCCCGG CGGATCTTTC CCGCCCCCCG TTCCTCCCGA 8460 

CCCCTCCACC CGCCCTCCCT TCCCCCGCCG CCCCTCCTCC TCCTCCCCGG AGGGGGCGGG 8520 

CTCCGGCGGG TGCGGGGGTG GGGGGGCGGG GCCGGGGGTG GGGTCGGCGG GGGACCGTCC 8580
CCCGACCGGC GACCGGCCGC
CTCCGGGACG GCTGGGAAGG 
CCGTCCTCCT CCTCCCCCGT 
CGGGTCGGGG CGGCGGCGGC 
CCCCCCCCGA GTGTTACAGC 
AGCGAGACCC GTCGCCGCGC 
GCGAGGGGGG TCTCCCCCGC
CCCTCCCACG GCGCGACCGC 
GGGGGGTGCC GCGCGCGGGT 
GTCGCGCCGT CGGGCCCGGG
GGGACGGCGG AGCGAGCGCA 
TGAAACACGG ACCAAGGAGT 
GTGGCGCAAT GAAGGTGAAG 
TCCAGTCCGC CGAGGGCGCA 
ACGAGCGCAC GTGTTAGGAC 
AGGAAACTCT GGTGGAGGTG 
TATAGGGGCG AAAGACTAAT 
GGATAGCTGG CGCTCTCGCA 
CGAATGATTA GAGGTCTTGG
TAAGAAGCCC GGCTCGCTGG 
TTTTGGTAAG CAGAACTGGC
CCGACGCTCA TCAGACCCCA 
TGGAAGTCGG AATCCGCTAA 
AAAATGGATG GCGCTGGAGC 
ACGGGAGCGG CGGGGGCGGC 
CGGCGGCGGC GGCGGGGGTG 
TCCTCCCGCC CACGCCCCGC
GTAGGAGGGC CGGTGCGGTG
CAGGTGCAGA TCTTGGTGGT 
GAGAAGGGTT CCATGTGAAC 
AGCGCCGTTC CGAAGGGACG 
CGGGTTCAGA TCCCCGAATC
TAACGCGACC QATCCCGGAG
GAAGGGCAGG GCGCCCTGGA 
GTCGCGGTTC CGGCGGCGTC 
GTGTAAATCT CGGGCCGGGC 
CTGGCATGTT GGAACAATGT 
TAAGGATTGG CTCTAAGGGC 
CGCCGCGGCT GGACGAGGCG 
^SSH^^^CCC ACCCGCGCGC 
CTCTCCCCCG CTCCCCGTCC 
GGGGAGAAGG GTCGGGGCGG 
AGGTCCCCGC GAGGGGGGCC 
GCQAGCCGGG CCCTTCCCGT 
GGAGCCCGGC GGCGGCGCGG 
GTCCGCTGGG GGCGGGAGCG 
TCGTCCCCCC GCCCTACCCC 
GCGCGGCGGC GGCGGCGGCA 
CGCCCCCGGG GCCGCGGTTC 
AGAACTGGTG CGGACCAGGG 
GGCGGCGGGT GTTGACGCGA 
ATTCAATGAA GCGCGGGTAA 
CCTCGTCATC TAATTAGTGA 
TACTATCCAG CGAAACCACA 
ACCCTGTTGA GCTTGACTCT 
TGGGAGGCCC CCGGCGCCCC 
CCTGCGGGGC GCCGGTGAAA
GGGGGGGCGA GCCCGAGGGG 
ACCCGCTCCG GGGACAGTGC 
GTAACGCAGG TGTCCTAAGG 
GCAAAAGCTC GCTTGATCTT



CGCCiSGGCGC 
GCGGGCGGGG 
CTCCGCCCCC
GGCGGGGGTG 
CCCCCCGGCA 
TCTCCCCCCT 
GGGGGCGCGC 
TCTCCCACCC 
CGGGGGGCGG 
GGAGGTTCTC 
CGGGGTCGGC 
CTAACACGTG 
GCCGGCGCGC 
CCACCGGCCC 
CCGAAAGATG 
CGTAGCGGTC 
CGAACGATCT 
GACCCGACGC 
GGCCGAAACG
CGTGGAGGCG 
GCTGCGGGAT 
GAAAAGGTGT 
GGAGTGTGTA 
GTCGGGCCCA 
GCGCGCGCGC 
TGGGGTCCTT 
TCCCCGCCCC
AGCCTTGAAG 
AGTAGCAAAT 
AGCAGTTGAA 
GGCGATGGCC 
CGGAGTGGCG 
AAGCCGGCGG 
ATGGGTTCGC 
CGGTGAGCTC 
CGTACCCATA 
AGGTAAGGGA 
TGGGTCGGTC 
CGCGCCCCCC 
GCCGCTCGCT 
TCCCCCCTCC 
CAGGGGCCGC 
CCGGGGACCC 
GGATCGCGCG 
CGCGCCCCCC 
GTCGGGCGGC 
CCCGGCCCCG 
GGCGGCGGAG 
CGCGCGCGCC 
GAATCCGACT 
TGTGATTTCT 
ACGGCGGGAG 
CGCGCATGAA 
GCCAAGGGAA 
AGTCTGGCAC 
CCCGGTGTCC 
TACCACTACT 
CTCTCGCTTC 
CAGGTGGGGA 
CGAGCTCAGG 
GATTTTCAGT 



ATTTCCACCG 
AAGGTGGCTC 
CGGCCCCGCG 
GCGGCGGCGG 
GCAGCACTCG 
CCCGGCGGCC
CGGCGTCTCC 
CTCCTCCCCG 
GGCGGACTGT 
TCGGGGCCAC 
GGCGACGTCG 
CGCGAGTCGG 
TCGCCGGCCG 
GTCTCGCCCG 
GTGAACTATG 
CTGACGTGCA 
AGTAGCTGGT 
ACCCCCGCCA 
ATCTCAACCT 
GGCGTGGAAT 
GAACCGAACG
TGGTTGATAT 
ACAACTCACC 
TACCCGGCCG 
GCGCGTGTGG 
CCCCCGCCCC 
CGGAGCCCCG 
CCTAGGGCGC 
ATTCAAACGA 
CATGGGTCAG 
TCCGTTGCCC 
GAGATGGGCG 
GAGCCCCGGG 
CCCGAGAGAG 
TCGCTGGCCC 
TCCGCAGCAG 
AGTCGGCAAG 
GGGCTGGGGC 
CCACGCCCGG 
CCCTCCCCAC 
CCGGGGGAGC 
GCGGCGGCGG 
GGGGGGCCGG 
AGCTGCGGCG 
ACCCCCACCC 
GGCGGTCGGC 
TCCGCCCCCC 
GGGCCGCGGG 
TCGCCTCGGC 
GTTTAATTAA 
GCCCAGTGCT 
TAACTATGAC 
TGGATGAACG 
CGGGCTTGGC 
GGTGAAGAGA 
CCGCGAGGGG
CTGATCGTTT 
TGGCGCCAAG 
GTTTGACTGG 
GAGGACAGAA 
ACGAATACAG 



CGGCGGTGCG 
GGGGGGCCCC 
TCCTCCCTCG 
CGGGGGCGGC 
CCGAATCCCG 
ACCCCCGCGG
TCGTGGGGGG 
CGCCCCCGCC 
CCCCAGTGCG 
GCGCGCGTCC 
GCTACCCACC 
GGGCTCGCAC 
AGGTGGGATC 
CCGCGCCGGG 
CCTGGGCAGG 
AATCGGTCGT 
TCCCTCCGAA 
CGCAGTTTTA 
ATTCTCAAAC 
GCGAGTGCCT 
CCGGGTTAAG 
AGACAGCAGG 
TGCCGAATCA 
TCGCCGGCAG 
TGTGCGTCGG 
CCCCCCCACG 
CGGACGCTAC 
GGGCCCGGGT 
GAACTTTGAA 
TCGGTCCTGA 
TCGGCCGATC 
CCGCGAGGGG 
GAGAGTTCTC 
GGGCCCGTGC 
TTGAAAATCC 
GTCTCCAAGG 
CCGGATCCGT 
GCGAAGCGGG 
GGCACCCCCC 
CCCGCGCCCT 
GCCGCGTGGG 
CGGGGGCGGC 
CGGCGGCGCG 
GGCGTCGCGG 
CACGTCTCGG 
GGGCGGCGGG 
GTTCCCCCCT 
CCGGTCCCCC 
CGGCGCCTAG 
AACAAAGCAT 
CTGAATGTCA 
TCTCTTAAGG 
AGATTCCCAC 
GGAATCAGCG 
CATGAGAGGT 
CGGGGGGCGG
TTTCACTGAC 
CGCCCGCCCG 
GGCGGTACAC 
ACCTCCCGTG
ACCGTGAAAG 



CCGCGACCGG 
GTCCGTCCGT 
GGAGGGCGCG 
GGGACCGAAA 
GGGCCGAGGG 
GGAATCCCCC
GCCGGGCCAC 
CCGGCGACGG 
CCCCGGGCGG 
CCCGAAGAGG 
CGACCCGTCT 
GAAAGCCGCC 
CCGAGGCCTC 
GAGGTGGAGC 
GCGAAGCGAG 
CCGACCTGGG
GTTTCCCTCA
TCCGGTAAAG
TTTAAATGGG
AGTGGGCCAC 
GCGCCCGATG 
ACGGTGGCCA
ACTAGCCCTG 
TCGAGAGTGG 
AGGGCGGCGG 
CCTCCTCCCC 
GCCGCGACGA 
GGAGCCGCCG 
GGCCGAAGTG 
GAGATGGGCG 
GAAAGGGAGT 
TCCAGTGCGG 
TTTTCTTTGT 
CTTGGAAAGC 
GGGGGAGAGG 
TGAACAGCCT 
AACTTCGGGA 
GCTGGGGGCG 
TCGCGGCCCT 
CTCTCTCTCT 
GGCGCGGCGG 
CGGCGGGGGC 
GACTCTGGAC 
CCGCCCCCGG 
TCGCGCGCGC 
GCGGGGCGGT 
CCTCCTCGGC 
CCGCGGGGTC 
CAGCCGACTT 
CGCGAAGGCC 
AAGTGAAGAA 
TAGCCAAATG 
TGTCCCTACC 
GGGAAAGAAG 
GTAGAATAAG
GGTCCGCGGC 
CCGGTGAGGC 
GCCGGGCGGG 
CTGTCAAACG 
GAGCAGAAGG
CGGGGCCTCA 



8640 
8700 
8760
8820
8880
8940
9000
9060
9120
9180 
9240 
9300 
9360 
9420 
9480 
9540 
9600 
9660 
9720 
9780 
9840 
9900 
9960 
10020
10080
10140
10200 
10260 
10320 
10380 
10440 
10500 
10560 
10620 
10680 
10740 
10800 
10860 
10920 
10980 
11040 
11100 
11160 
11220 
11280 
11340 
11400 
11460 
11520 
11580 
11640 
11700 
11760 
11820 
11880 
11940
12000 
12060 
12120 

12180 

12240 
CGATCCTTCT

ACTGGCTTGT 

TCTTCCTATC 

AACGTGAGCT 

TGTTGCCATG

GTGCTTGGCT 

TCTAAGTCAG 

TCGGATAGCC 

GCCGCGGGAG 

TCGTCCTGGG 

CACGTTCGTG 

TTCGTACGTA 

GGGTTTGTCC

CCGTCCTTCC 

GGGTGGGGGG 

TCCCCGTTCA 

GTTTGGGAGC 

CGGGGGTTGG 

TCCTCCTCGC 

GAGCCCCACG 

CCTCCGGGGT 

CGGGTCGACC 

GCAACCGGAG 

GCTCCGGACT 

GCAGCGGCGC 

CTCGGCCCGC 

CACCGCGAGT 

CTGCCTCCTC 

TTTTTTCTTT 

CTTCTTGTGT 

CTCTCGCTCT 

TCGCCCTCTC 

GTCGCTCTCG 

CCCTCCCTCC 

GCCGCTGTCT 

CAGTGCCCGT 

TCGACCAGCT 

ACCAGCAGGC 

GCGCCCGCGC 

TCCTACATTT 

ACTGCTGCCG 

TTCTTTCTTT 
TTTCTCTCTC 
TTTCTCTCTC 
TCCCCCTCCC 
TGTCTCGCCG 
CCCGTCGGGA 
CAGCTGCCGC 
CAGGCGGCCG 
CGGCGACCTC 
CACATTTTTT 
GCTGCCGTCA 
TTTCTTTCTT 
TCTCTGTCTC 
GCTGCTGCTG 
AAAGACGTAA 
CCGCCTCGGC 
TCCTTCCTTT 
CACACACACA 
AACTATGTAA 



GACCTTTTGG 
GGCGGCCAAG 
ATTGTGAAGC 
GGGTTTAGAC 
GTAATCCTGC 
GAGGAGCCAA 
AATCCCGCCC 
GGTCCCCCGC 
GGCGCGTGCC 
AAACGGGGCG
GGGAACCTGG 
GCAGAGCAGC 
GCGCGCGCGT 
GTTCGTCTTC 
GAGGGCGCGC 
CGCCGGGGCG 
CGCGGAGGCG 
CCGCGCGGCG 
TCCTCCGCAC 
GGCGTCCCCG 
CGACCGCCTG 
GCGGCCTTCT 
CGTCCCCGTC 
TAGCCGGCGT 
ACGCACGCGA 
GGTGGAGCTG 
TTGCGTCCGC 
CTTTTTCGCT 
CTTTCTTTCT 
TCTCTTCTTG 
CGCCCTCTCT 
TCTCTCTCTT
CCCTCTCGCT 
CTCCCTCCCT 
CGCCGTACCC 
CGGGACGAGC 
GCCGCCCGCG 
GGCCGCCGGA 
CTCCACCGGC 
TTTTCAGCCC 
TCAGCCAGTA 

CTTTCGCTCT 
TCTCTCTCTC 
TCTCTGTCTC 
TCCCTCTCTC 
TGTCCCGGGT 
CGAGCCGGAC 
CCGCGAGCTC 
CCGGACGCTG 
CACCGGCCTC 
TCAGCCCCAC 
GCCAGTAATG 
TCTTTCTTTC 
TCTCTCTCTG 
CTGCCTCTGC 
TTTCACCATT 
CTCCCAAAGA 
TTTCAATCTT 
CACACACACA 
ATGATATTTC 



GTTTTAAGCA 
CGTTCATAGC 
AGAATTCGCC 
CGTCGTGAGA 
TCAGTACGAG 
TGGGGCGAAG 
AGGCGAACGA 
CTGTCCCCGC 
CCGCCGCGCG 
CGGCCGGAAA 
CGCTAAACCA 
TCCCTCGCTG 
GCGTGCGGGG 
CTCCCTCCCG 
GACCCCGGTC 
GCTCGTCCGC 
CCGCGGCGAG 
CGGTGGGGGG 
GGGTCGACCG 
CACCCGGCCG 
CGCCCGCGGG 
CCACCGAGCG 
TCGGTCGGCA 
CTGCACGTGT 
GGGCGTCGAT 
GGACCACGCG 
GGGACCTTTA
TTTAGGTTTT 
TTCTTTCTTT 
CTCTTCCTCT 
CTCTTCTCTC 
CTCTCTGTCT 
CTCTCTCTGT 
CCCTCCCCTT 
CGGGTCGACC 
CGGACCCGCC 
AGCTCCGGAC 
CGCAGCGGCG 
CTCGGCCCGC 
CACCGCGAGT 
CTGCCTCCTC 

CGCTCTCTCG 
TCTCTCTCTC 
TCTCTCTCTC 
CCCTTCCTTG 
CGACCGGCGG 
CCGCCGCGTC 
CGGAGTTAGC 
CGGCGCACCG 
GGCCCGCGGT 
CGCGAGTTTG 
CTTCCTCCTT
TTTCTTTCTT 
TCTCTCTCCC 
CTCCACGGTT
TTGGCCGGGC 
CTGCTGGGAG 
ATTTTCTGAA 
CACACACACA 
CATAATTAAT 



GGAGGTGTCA 
GACGTCGCTT 
AAGCGTTGGA 
CAGGTTAGTT 
AGGAACCGCA 
CTACCATCTG 
TACGGCAGCG 
CGGCGGGCCG 
CCGGGACCGG 
GGCGGCCGCC 
TTCGTAGACG 
CGATCTATTG 
GGCCCGGCGG 
GCCTCTCCCG 
GGCCGCCCCG 
TCCGGGCCGG 
CCGGGCCCCG 
CCACCCGGGG 
ACGAACCGCG 
ACCTCCGCTC 
CGTGAGACTG 
GCGGTGTAGG 
CCTCCGGGGT 
CCCGGGTCGA 
TCCCCTTCGC 
GAACTCCCTC 
AGAGGGAGTC 
GCTTGCCTTT 
CTTTCTTTCT 
GTCTGTCTCT 
TCTCTCTCTC 
CTCTCTCTCT 
CTCTGTCTGT 
CCTTGGCGCC 
GGCGGGCCTT 
GCGTCCCCGT 
TTAGCCGGCG 
CACCGACGGA 
CGTGGAGCTG 
TTGCGTCCGC 
CTTTTTCGCT 
CTTTCTTTCT 
CTCTCTCCCT 
TCTGTCTCTC 
TCTCTCTCTC 
GCGCCTTCTC 
GCCTTCTCCA 
CCCGTCTCGG 
CGGCGTCTGC 
ACGCGAGGGC 
GGAGCTGGGA 
CGTCCGCGGG 
TTTTGCTTTT 
TCTTTCTTTC 
CTCCCTCCCT 
CAAGCAAACA 
TGGTCTCGAA 
TACAGATGTG 
CGCTGCCGTG 
CACACACACA 
ACGTTTATAT 



GAAAAGTTAC 
TTTGATCCTT 
TTGTTCACCC 
TTACCCTACT 
GGTTCAGACA 
TGGGATTATG 
CCGCGGAGCC 
CCCCCCCCTC 
GGTCCGGTGC 
CCCTCGCCCG 
ACCTGCTTCT 
AAAGTCAGCC 
GCGTGCGCGT 
CCGACCGCGG 
CTTCTTCGGT 
GACGGGGTCC 
TGGCCCGCCG 
TCCCGGCCCT 
GGTGGCGGGC 
GCGACCTCTC 
AGCGGCGTCT 
AGTGCCCGTC 
CGACCAGCTG 
CCAGCAGGCG 
GCGCCCGCGC 
TCCCACATTT 
ACTGCTGCCG 

TTCTTTCTTT
CTCTCTCTCT 
TCTCTCTCTG 
CTCTCTCTCT 
CTCTCTCTCT 
TTCTCGGCTC 
CTCCACCGAG 
CTCGGTCGGC 
TCTGCACGTG 
GGGGGCTGAT 
GGACCACGCG 
GGGACCTTTA 
TTTAGGTTTT 
TTCTTTCTTT 
CGCTCGTTTC 
GCTCTCGCCC 
TCTCTCTCTC 
GGCTCTTGAG 
CCGAGCGGCG 
TCGGCACCTC 
ACGTGTCCCG 
GTCGATTCCG 
CCACGCGGAA 
ACTTTTAAGA 
TGGTTTTGCC 
TCTCTCTCTC 
CCTTGGTGCC 
GCAAGTTTTC 
CTCCCGACCT 
AGCCACCATG 
TATGAACATA 
CACACACCCC 
TATGTTACTT 



CACAGGGATA 
CGATGTCGGC
ACTAATAGGG 
GATGATGTGT 
TTTGGTGTAT 
ACTGAACGCC 
TCGGTTGGCC
CACGCGCCCC 
GGAGTGCCCT 
TCACGCACCG 
GGGTCGGGGT 
CTCGAGACAA 
TCGGCGCCGT 
CGTGGTGGTG 
TCCCGCCTCC 
GGGGAGCGTG 
GTCCCCGTCC 
CGCGCGTCCT 
GGCGGGCGGC 
CTCGGTCGGG 
CGCCGTGTCC 
GGGACGAACC 
CCGCCCGCGA 
GCCGCCGGAC 
CTCCACCGGC 
TTTTCAGCCC 
TCAGCCAGTA 
TTTTTTTTTT 
CGCTTGTCTT
CTCTCTCTGT 
TCTCTCGCTC 
CTCTCTCTCT 
CTCCCTCCCT 
TTGAGACTTA 
CGGCGTGCCA 
ACCTCCGGGG
TCCCGGGTCG 
TCCCGTTCAC 
GAACTCCCTC 
AGAGGGAGTC 
GCTTGCCTTT 
CTTTCTTTCT 
TTTCTTTCTC 
TCTCTCTCTC 
CGTCCCTCCC 
ACTTAGCCGC 
TGCCACAGTG 
CGGGGTCGAC 
GGTCGACCAG 
GTTCACGCGC 
CTCCCTCTCC 
GGGAGTCACT 
TTGCGTTTTC 
TCTCTCTCTC 
TTCTCGGCTC 
TATTTCGAGT 
AGTGATCCGC 
CCCGGCCGAT 
CATCTACACA 
GTAGTGATAA 
TTAATGGATG 



12300 
12360 
12420
12480 
12540 
12600 
12660 
12720 
12780 
12840 
12900 
12960 
13020 
13080 
13140
13200 
13260 
13320 
13380 
13440 
13500 
13560 
13620 
13680 
13740
13800 
13860 
13920 
13980
14040
14100
14160
14220
14280
14340 
14400 
14460 
14520 
14580
14640 
14700 
14760 
14820 
14880 
14940 
15000 
15060 
15120 
15180 
15240 
15300 
15360 
15420 
15480 
15540 
15600 
15660 
15720 
15780 
15840 
15900 
AATATGTATC
CTTCATTCAT 
CCGCCTGGTC 
AGGGATCTCT 
CAGGTTACGT 
CGATTGATTG 
ATTCTCATGG
TTATTTCCTT 
CTGGCAGGGT 
CGACCAAACG 
CGCACCACCA 
GTGTGTGTGT 
ATGTATGTAC 
GCCCACGCTG 
CACTGCTGCT 
CGTGCCTGCC 
TTGTCCATGC 
GGCGTCATCT 
CGGCCTCCCG
GTGGGTCGGT
ATCTCACCGA 
TGTCGCCCAG 
TTCAAGTGAT 
GTGCCCGGCT 
GTCTTCAACT 
TCTTTTTTTT 
CCTCCTCCTC 
CTTTCAGCTG 
GCCTTGACTC 
AATGGAAAAG 
GCCGTCTCCC 
TTCGGTGCCG 
CAGCCGAGGC 
GACACGCCTT 
ACCTGCCACC 
GCTACCCTCC 
TCGGATCCTC 
GGACCACCCC 
CCCAGCATTG 
CGGTCTGCCT 
CCTTTCTCCA 
GACCTTCATT 
CAGATCAAAC 
TCTCTCTCTC 
TATCTAGTTC 
CGCCCCACCC 
ATCTGGGCCG 
GTGGATCGCT 
CTCTGAAAAA 
GGGAGACTGG 
TCGTGCCGTG 
TGTTATTATA 
GCTGAGACGA 
CCACTGTATC 
TTTCTTCCCT 
TTCTTTCTTT 
TGCCTTTCTT 
CTCCCAAAGT 
TCTTGGAAAG 
GCATCTCGCT 
GCTCCTGGAC 



GAAGCCCCAT 
TATTTATTAA 
TTCTGTCTCT
TAAGCCCGGG 
GGGCTGCGGT 
CGATCTCAAT 
GTTGTTCTGT 
CCTTCCTTCC 
CTTCCTCTGT 
GTCGTTCTGC 
CACCGGCTGA 
GTGTGTGTGT 
GTATGTATGT 
GTCTCGAACT 
ATTACAGGCG 
TGCCTGCCTA 
TCTGGGCACA 
CACGTGTCGA 
GAGTGCTGTG 
TCTTTCCGTT 
TCCGCCTTTT 
GGTGGAGTAC 
TCTCCTGCCT 
AATTTTTCTA 
TCCGACCGTT 
TCTTTTCTTT 
CTCCTCCTCC 
GGCTCTCCTA 
TTCTCCCGTC 
ATGAAAGAAA 
GGGGTGTACC 
AAACCTCCCG 
TCCCACCGCC 
CCAGATCTAT 
TTCCAGGGAG 
CCCGGCTGGC 
CGGCGAAGAC 
GGACCGTGCT 
TAAAGGGTGC 
TCAGCTGCCT 
GCACACAGAT 
TGTGGAATCC 
ACTATTTCCG 

TCTCTCTCTC
ACAGAGCACA
TCCACCCGTT
c5gcacgctag 
TGGGGCCGGG
TAGAACGATT
GGCGGGCGAC

GCGATGCGGC 
AGATGAGTTG 
GGAGAAGATC 
CTGGGCAGTC 
TCTCTTTTCT 
CTTTCTTTCT 
GTTTTCTTCT
GCTGGGATGA 
TGAGACGCAG 
CCGTCACCCC 
TCGAGCGATC 



TTCATTTACA 
TAATTTTCGT 
GCGCTCTGGT 
AGGAGAGGTT 
GCGGTGGGGT 
TGCCTTTTAG
GTCATTGTCA
TTCCTTCCTT 
CTCTGCCGCC 
GTCTGATCCC 
CTTTTATGTT 
GTGTGTGTGT 
ATGTATGTGA 
CCTGTCCTCA 
TGAGACGCTG 
TCAATCGTCT 
CGTGGTCTCT 
GGTGATCTCG 
ATGACACGCG 
TTTAATACGG
CGTTCTTTCT 
GATGGCGGCT 
CAGCCTTCGC 
TTTTTAGTAC
GGAGAATCTT 
TCTTTCCTTC 
TCCTCCTCGT 
CTTGTGTTGC
ACATCCGCCG 
TAAACACGAA 
TTGGACCCGG 
AGGGCCTCCT 
GCCCCTGGCA 
ATCCTGCCGG 
CTCTGAGGCG 
CTTTGCCGGG 
TTCCACCGGA 
GTTCTTGGGG 
GTGGGTATGG 
CAGGCGTGAA 
GAGACGCACG 
TCAGTCATCG 
GGTCCTCGTG 
TCTCGCACGC 
CTCACTTCCC 
GGCTGACGAA 
CTCACGCCTG 
AGTTCGAGAC 
AGCCGGGCCT 
TTGTTCCAAC 
CTGGATGACG 
TGCGCGGTGA 
ACTTGAGGCC 
ACCGGTCAAG 
TCTTTTTGCT 
TTTTCTTTTT 
TTCCTCCCTT 
CTGGCGGGAG 
AGAGCGCCTT 
GGCAGTGGTG 
CTTCCACCTC 



TACACGTGTA 
TTATTTATTT 
GACCTCAGCC 
AACGTGGGCT 
GGGGTGGGGT 
CTTCATTCAT
CGTTCATCGT 
CCTTCCTTCC 
CAGGATCACC 
TCCCATCCCC 
GTTTCTGATG 
GTGTGTGTGT 
GTGAGATGGG 
AGCAATCCGC
CGCCTGGCTC 
TCTTTTTAGT 
TTTCAAACTT 
AACTTTTAGG 
TGGGCACGGT 
GGACTGCGAA 
TTTTATTCTC 
CTGGGCTCAC 
GAGTAGCTGG 
AGATGGGGTT 
AACTTTCTTG 
TCCTCCCCCC 
CCTCCTCCTC 
TCTGTTGCTC 
TCTGGTTGTT 
GACGGAAAGC 
AAACACGGAG 
TCCCTCTCCC 
TTTTCCATAG 
ACGTCTCTGG 
GATGCGACCC 
CGACCCCAGG 
TGCCCCGGGT 
GTGGGTTGAC 
AAATGTCACC 
GACAACTTCC 
AGAGGGAGAA 
ACACACAAGA 
GTGGGATTGG 
GCACGCGCGC 
CTTTTCACAG 
ACCCCTTCTC 
TCACTCCGGC 
CAGGCTGGCC 
GGTGGCGTGG 
CGGGGAGGCC 
GAGCGAGACC 
TGGCCGCCTG 
CCACAGGTCG 
GAGATATGCC 
TCTCTTTTCT 
CTCTCTTCCC 
CCTCCCTTCC 
GCACCATGCC 
CCAGTGATCT 
CCGTCGTAAC 
AGCCTCCAGA



TGTATATCCT 
TCTTTTCTTT 
TCCCAAATAG 
GTGATCGCAC 
GGGGTGGGGT 
ACCGTGTTAT 
TTGCTTGCCT 
TTCCTTCCTT 
CCAACCTCAA 
ATTACCTGAG 
TTTTCCGTAG 
GTGTGTGTGT 
TTTCGGGGTT 
CTGCCTGCCT 
CTTCTACATT 
ACGGATGTCG 
CTATGATTAT 
CTCCAGAGAT 
ACGCTCTGGT 
CGAAGAAAAT 
TTTAGACGGA 
CGCACCCTCC 
AATGACAGAG
TCTCCATCTT 
GTGGTGGTTG 
CCCACCCCCC 
CTCCTCCTCC 
ACGCTGGTCT 
GAAATGAGCA 
ACGGTGTGAA 
GGAGCTTGGC 
CCTTGTCCCC 
GAGAGGTATG 
CTCGGCGTGC 
CCACCCCCCC 
GGAACCGCGT 
GGGCCGGTTG 
GTACAGGGTG 
TAGGATGCCC 
CATCGGAACC 
ACAGCTCAAT 
CAGGTGACTA 
TCTCTCTCTC 
ACACACACAC 
TACGCAGGCT 
TACAATTGAT 
ACTTTGGGAG 
GACGTGGCGA 
GCTTGGAATC 
GAGGCCGCGA 
CCGTCTCGAG 
TAGTCGCGGC 
AGGCTTCGGT 
CCTTCCCCGT 
TTCTTTCTTT 
CTCTTTCTTT 
TTCTTTCCTC 
TGCTTGGCCC 
CATTGACTGA 
TCACTCCCTG
GTACAGAGCC 



TCCTCCCTTC 
TGGGGCCGGC 
CTGGGACTAC 
ACTTCCACTC
GCAGAGAAAA 
TTGCTCGTTT 
GCTTGCCTGT 
CCCTCCCTTA 
CGCTTTGGAC 
ACTACAGGCG 
GTAGGTATGT 
GTGTGTATCT
CTATCATGTT 
CGGCCGCCCA
TGCCTGCCTG 
TCTCGCTTTA
TATTATTGTA 
CCTCCCGCAT 
CGTGTTTGTG 
TTTCAGACGC
GTTTCACTCT
GCCTCCCAGG
ATGAGCCATC
GGTCAGGCTG 
TTTTCCTTTT 
TTGTCGTCGT 
TCTTTCATTT 
CAAACTCCTG 
TCTCTCGTAA 
CGTTTCTCTT 
TGAGTGGGTT 
GCTTCTCCGC 
GGAGAGGACT 
CCCACCGGCT 
GTCACGTCCC 
TGATGCTGCT
GGATCAGACT 
GACTGGCAGC 
TCCTTCCCTT 
TCTTCTCTTC 
AGATACCGCT 
GGCAGGGACA 
TCTCTCTCTC 
ACAATTTCCA 
GAGTAAAACG 
GAAAAAGATG 
GCCGAGGCGG 
AACCCCGTCT 
ACGACCGCTC 
TGAGCTGAGA 
AGAATCATGA 
TACTCGGGAG 
CGGCCGTGAC 
TTGCTTTTCT 
CTTTCTTTCT 
CCTGCCTTCC 
CCGCCTCAGC 
AAAGAGACCC 
TTTAGAGACG 
CAGCGTGGAC
TGGC3ACCGCG 



15960 
16020 
16080 
16140 

16200

16260 
16320 
16380
16440
16500 
16560 
16620 
16680 
16740 
16800 
16860 
16920 
16980 
17040
17100 
17160 
17220 
17280
17340 
17400 
17460 
17520 
17580 
17640 
17700 
17760 
17820 
17880 
17940 
18000 
18060 
18120 
18180 
18240 
18300 
18360
18420 
18480 
18540 
18600 
18660 
18720 
18780 
18840 
18900 
18960 
19020 
19080 
19140 
19200 
19260 
19320 
19380 
19440
19500 
19560 
GGCACGCGCC ACTGTGCCCA CACCGTTTTT AATTGTTTTT TTTTCCCCCG AGACAGAGTT 19620

TCACTCTCGT GGCCTAGACT GCAGTGCGGT GGCGCGATCT TGGCTCACCG CAACCTCTGC 19680

CTCCCGGTTT CAAGCGATTC TCCTGCATCG GCCTCCTGAG TAGCCGGGAT TGCGGGCATG 19740 

CGCTGCCACG TCTGGCTGAT TTGGTATTTT TAGTGGAGAC GGGGCTTCTC CATGTCGATC 19800 

GGGCTGGTTT CGAACTCCCG ACCTCAGGTG ATCCGCCCTC CCCGGCCTCC GGAAGTGCTG 19860

GGATGACAGG CGTGAGCCAC CGCGCCCGGC CTTCATTTTT AAATGTTTTC CCACAGACGG 19920

GGTCTCATCA TTTCTTTGCA ACCCTCCTGC CCGGCGTCTC AAAGTGCTGG CGTGACGGGC 19980

GTGAGCCACT GCGCCTGGAC TCCGGGGAAT GACTCACGAC CACCATCGCT CTACTGATCC 20040 

TTTCTTTCTT TCTTTCTTTC XTTCTTTCTT TCTTTCTTTC xxTCTTTCTT TCTTTCTTGA 20100 

TGAATTATCT TATGATTTAT TTGTGTACTT ATTTTCAGAC GGAGTCTCGC TCTGGGCGGG 20160

GCGAGGCGAG GCGAGGCACA GCGCATCGCT TTGGAAGCCG CGGCAACGCC TTTCAAAGCC 20220 

CCATTCGTAT GCACAGAGCC TTATTCCCTT CCTGGAGTTG GAGCTGATGC CTTCCGTAGC 20280 

CTTGGGCTTC TCTCCATTCG GAAGCTTGAC AGGCGCAGGG CCACCCAGAG GCTGGCTGCG 20340

GCTGAGGATT AGGGGGTGTG TTGGGGCTGA AAACTGGGTC CCCTATTTTT GATACCTCAG 20400 

CCGACACATC CCCCGACCGC CATCGCTTGC TCGCCCTCTG AGATCCCCCG CCTCCACCGC 20460

CTTGCAGGCT CACCTCTTAC TTTCATTTCT TCCTTTCTTG CGTTTGAGGA GGGGGTGCGG 20520 

GAATGAGGGT GTGTGTGGGG AGGGGGTGCG GGGTGGGGAC GGAGGGGAGC GTCCTAAGGG 20580

TCGATTTAGT GTCATGCCTC TTTCACCACC ACCACCACCA CCGAAGATGA CAGCAAGGAT 20640

CGGCTAAATA CCGCGTGTTC TCATCTAGAA GTGGGAACTT ACAGATGACA GTTCTTGCAT 20700 

GGGCAGAACG AGGGGGACCG GGGACGCGGA AGTCTGCTTG AGGGAGGAGG GGTGGAAGGA 20760 

GAGACAGCTT CAGGAAGAAA ACAAAACACG AATACTGTCG GACACAGCAC TGACTACCCG 20820 

GGTGATGAAA TCATCTGCAC ACTGAACACC CCCGTCACAA GTTTACCTAT GTCACAATCT 20880 

TGCACATGTA TCGCTTGAAC GACAAATAAA AGTTAGGGGG GAGAAGAGAG GAGAGAGAGA 20940 

GAGAGAGAGA GAGAGAGAGA GAGAGAGAGA GAGAGAGAGG AGGGAGAGAG GAAAACGAAA 21000 

CACCACCTCC TTGACCTGAG TCAGGGGGTT TCTGGCCTTT TGGGAGAACG TTCAGCGACA 21060

ATGCAGTATT TGGGCCCGTT CTTTTTTTTT CTTCTTCTTT TCTTTCTTTT TTTTTGGACT 21120 

GAGTCTCTCT CGCTCTGTCA CCCAGGCTGC GGTCGCGGTG GCGCTCTCTC GGCTCACTGA 21180

AACCTCTGCT TCCCGGGTTC CAGTGATTCT TCTTCGGTAG CTGGGATTAC AGGCGCACAC 21240

CATGACGGCG GGCTCATATT CCTATTTTCA GTAGAGACGG GGTTTCTCCA CGTTGGCCAC 21300 

GCTGGTCTCG AACTCCTGAC CTCAAATGAT CCGCCTTCCT GGGCCTCCCA AAGTGCTGGA 21360 

AACGACAGGC CTGAGCCGCC GGGATTTCAG CCTTTAAAAG CGCGGCCCTG CCACCTTTCG 21420 

CTGTGGCCCT TACGCTCAGA ATGACGTGTC CTCTCTGCCG TAGGTTGACT CCTTGAGTCC 21480

CCTAGGCCAT TGCACTGTAG CCTGGGCAGC AAGAGCCAAA CTCCGNNCCC CCACCTCCTC 21540

GCGCACATAA TAACTAACTA ACAAACTAAC TAACTAACTA AACTAACTAA CTAACTAAAA 21600 

TCTCTACACG TCACCCATAA GTGTGTGTTC CCGTGAGAGT GATTTCTAAG AAATGGTACT 21660 

GTACACTGAA CGCAGTGGCT CACGTCTGTC ATCCCGAGGT CAGGAGTTCG AGACCAGCCC 21720 

GGCCAACGTG GTGAAACCCC GTCTCTACTG AAAATACGAA ATGGAGTCAG GCGCCGTGGG 21780 

GCAGGCACCT GTAACCCCAG CTACTCGGGA GGCTGGGGTG GAAGAATTGC TTGAACCTGG 21840 

CAGGCGGAGG CTGCAGTGAC CCAAGATCGC ACCACTGCAC TACAGCCTGG GCGACAGAGT 21900

GAGACCCGGT CTCCAGATAA ATACGTACAT AAATAAATAC ACACATACAT ACATACATAC 21960 

ATACATACAT ACATACATAC ATCCATGCAT ACAGATATAC AAGAAAGAAA AAAAGAAAAG 22020 

AAAAGAAAGA GAAAATGAAA GAAAAGGCAC TGTATTGCTA CTGGGCTAGG GCCTTCTCTC 22080

TGTCTGTTTC TCTCTGTTCG TCTCTGTCTT TCTCTCTGTG TCTCTTTCTC TGTCTGTCTG 22140

TCTCTTTCTT TCTCTCTGTC TCTGTCTCTG TCTTTGTCTC TCTCTCTCCC TCTCTGCCTG 22200 

TCTCACTGTG TCTGTCTTCT GTCTTACTCT CTTTCTCTCC CCGTCTGTCT CTCTCTCTCT 22260 

CTCTCCCTCC CTGTTTGTTT CTCTCTCTCC CTCCCTGTCT GTTTCTCTCT CTCTCTTTCT 22320 

GTCTGTTTCT GTCTCTCTCT GTCTGTCTAT GTCTTTCTCT GTCTGTCTCT TTCTCTGTGT 22380 

GTCTGCCTCT CTCTTTCTTT TTCTGTGTCT CTCTGTCGGT CTCTCTCTCT CTGTCTGTCT 22440

GTCTGTCTCT CTCTCTCTCT CTCTGTGCCT ATCTTCTGTC TTACTCTCTT TCTCTGCCTG 22500

TCTGTCTGTC TCTCCCTCCC TTTCTGTTTC TCTCTCTGTC TCTCTCTCTC TCCCCCTCTC 22560 

CCTGTCTGTT TCTCTCCGTC TCTCTCTCTT TCTGTCTGTT TCTCACTGTC TCTCTCTGTC 22620 

CATCTCTCfC TCTCTCTGTC TGTCTCTTTC GTTCTCTCTG TCTGTCTGTC TCTCTCTCTC 22680 

TCTCTCTCTC TCTCTCTCTC TCCCTGTCTG TCTGTTTCTC TCTATCTCTC GCTGTCCATC 22740 

TCTGTCTTTC TATGTCTGTC TCTTTCTCTG TCAGTCTGTC AGACACCCCC GTGCCGGGTA 22800 

GGGCCCTGCC CCTTCCACGA AAGTGAGAAG CGCGTGCTTC GGTGCTTAGA GAGGCCGAGA 22860

GGAATCTAGA CAGGCGGGCC TTGCTGGGCT TCCCCACTCG GTGTATGATT TCGGGAGGTC 22920 

GAGGCCGGGT CCCCGCTTGG ATGCGAGGGG CATTTTCAGA CTTTTCTCTC GGTCACGTGT 22980

GGCGTCCGTA CTTCTCCTAT TTCCCCGATA AGCTCCTCGA CTTCAACATA AACGGCGTCC 23040 

TAAGGGTCGA TTTAGTGTCA TGCCTCTTTC ACCGCCACCA CCGAAGATGA AAGCAAAGAT 23100 

CGGCTAAATA CCGCGTGTTC TCATCTAGAA GTGGGAACTT ACAGATGACA GTTCTTGCAT 23160 

GGGCAGAACG AGGGGGACCG GGNACGCGGA AGCCTGCTTG AGGGRGGAGG GGYGGAAGGA 23220 
GAGACAGCTT
GGTGATGAAA
TGCTCATGTA 
GAGAGACGGG 
AGAGAGAGAG 
TTCTGGCCTT 

TTCTTCTTCT

GC GGT GCGQT 
TTCTTCGGTA 
AGTAGAGACG 
TCCACCTTCC
GCCTTTAAAA 
GTCCTCTCTG 
AGCAAGAGCC 
AACTAACTAA 
AAGAAATGGT 
CGAGACCAGC 
CAGGCGCCGT 
TGCTTGAACC 
TGGGCGACAG 
CATACATACA 
AAAAGAAAGA 
TGTCTGTTTC 
TCTGTCTGTC 
TGTCTGCCCT 
TCTCTCTCAC 
TCTCTGTCTG 
TCTTTTTCTG 
CTCTTTCTCT 
CTCTCTCTCT
TTTCTCACTG 
TCTGTCTCTG 
CCGTCTGTCT 
CTCTCTCTCT 
CTGCCTCTCT 
TTACTCTCTT 
TGTTTCTCTC 
TGTCTCTCTC 
GTCTGTCTCT 
GTCTGTCTTC 
TCTCTCCCTA
TTTCTCTCTC 
TCTCTGTCTC 
TCTTCTGTCT 
CTCTCTCTCT 
GTCTGCCTCT 
ATCTTCTGTC 
TCTCTCTCTC 
TTTCTCTCTG 
TCTCTGTCTC 
TGTGTGTCTG 
CCCTCTCTCT 
CTCTCTCTCT 
CTCTCTCTCT 
TGGCTGTCTG 
TCTGTTTCTC 
TCTCTTTCTT 
TCTTCTGTCT 
GTCCCTCCCT 
CTGTCTCTTT 
CACTGTGTCT 



CAGGAAGAAA 
TCATCTGCAC 
TGCTTGAACG 
GAGAGAGGGG 
AGAAAGAGAA 
TTGGGAGAAC 
TTTCTTTGTT
GGCGCTCTCT 
GCTGGGATTA 
GGGTTTCTGC 
TGGGCCTCCG 
GCGCGCGGCC 
CCATAGGTTG 
AAACTCCGTC 
AATCTCTACA 
ACTGTACACT 
CCGGCCCACG 
GGGGCAGGCA 
TGGCAGGCGG 
AGTGAGACCC
TACATACAAC 
GAAAATGAAA 
TCTCTGTTCG 
TGTCTGTCTC 
GTCTCACTGT 
TCCCTCCCTG 
GCTCTCTCTT 
TGTCTCTCTG 
GCCTGTCTGT 
CTCTNNNCCC 
TCTCTCTCTG 
TCTCTCCCTC 
GTCTGTCTCT 
CTCTCTCTCT 
CTTTCTCTTT 
TCTCTGCCTG 
TCTCTCCCTC 
TGTCCGTCTC 
GTCTCTGTCT 
TGTCTTACTC
CCTTTCTGTT 
TGTCTTTCTC 
TGTCTCTCTC 
TATTCTCTTT 
CTCTCTCTTT 
CTCTTTCTTT 
TTACTCTGTT 
TCTCTCTCCC 
TCTCTCTGTC
TGCCTCTCTC 
TCTTCTGTCT 
CCCTGCCTTT 
CTGTCTCTTT 
CTGTGTCTGT 
CCTGTCTCTC 
TCTCTCCCTC 
TTTCTCTGTC 
TACTCTCTTT 
CCCTGTCTGT 
CTCTTTCTCT 
GTCTTCTGTC



ACAAAACACG 
ACTGAACACC 
ACAAATAAAA 
GGAGAGGGGG 
GTAAAACCAA
GTTCAGCGAC 
TTTTTTTGGA
CGGCTCACTG 
CAGGTGCGCA 
ACGTTGGCCA 
AAAGTGCTGG 
CTGCCACCTT 
ACTCCTTGAG 
CCCCCACCTC 
CGTCACCCAT 
GAACGCAGGC 
TGGTGAAACC 
CCTGTAACCC 
AGGCTGCAGT 
GGTCTCCAGA 
ATACATACAT 
GAAAAGGCAC 
TCTCTGTCTT 
TTTCTTTCTT 
GTCTGTCTTC 
TCTGTTTCTC 
TCTCTATCTG 
TCTGTCTCTG 
CTCTCTCTCT 
TCCCI'GTCTG 
TCTGTGTGTT 
TCTGTGTGTA 
CTCTCTCCCT 
CTGTCTCTGT 
CTGTGTCTCT 
TCTATCTGTC 
TCTCGCTCTC 
TGTCTTTTTC 
CTCTCTCTCT 
TCCTTCTCTG 
TCTCTCTCCC 
TGTCTGTCTC 
TCTCTCTCTC 
CTCTCTCTCT 
CTGCCTGTTT 
TTCTGCGTCT 
TCCTTGCCTG 
TCCCTTTCTC 
CATCTCTGTC 
TCTCTCTCTC 
TACTCTCCTT 
CTCTTTCTCT 
CTCTCTCTCT 
CTCTCTCTCT 
TCTCTCTCTC 
TCTCTCTCTC 
JTCTCTGTCTC- 
CTCTGGCTGT 
CTCTTTCTCT 
CTCTCTCTCT 
TTACTCTCTT ' 



AATACTGTCG 
CCCGTCACAA 
GTTCGGGGGG 
GGGGAGAGAG 
CCACCACCTC 
AATGCAGTAT 
CTGAGTCTCT 
AAACCTCTGC 
CCATGACGGC 
CGCTGGTCTC 
AAACGACAGG 
TCGCTGCGGC 
TCCCCTAGGC 
CCCGCGCACA 
AAGTGTGTGT 
TTCACGTCTG 
CCCGTCTCTA 
CAGCTACTCG 
GACCCAAGAT 
TAAATACGTA 
ACAGATATAC 
TGTATTCCTA 
TCTCTCTCTC 
TCTGTCTCTG 
TATCTTACTC 
TCTCTCTCTC 
TCTGTTTCTC 
TCTCTCTCTG 
CTGTCTCTCC 
TTTCTCTCTG 
TCATTCTCTC 
TCTTTTGTCT 
GTCCCTCTCT 
CTTTCTCTGT 
CTCTCTCTCT 
TGTCTCTCTC 
TCTGTCTTTC 
TGTCTGTCTC 
CTCTCTCTCT 
CCTGTCCATC 
TAGCTCTCTC 
TTTCTCTGTC 
TCTCTCTCTC 
CTCTCTCTCT 
CTCTCTCTCT 
GTCTGTCTCT 
CCTGCCTGTC 
TTTCTCTGTC 
TTTCTATGTC 
TCTCTCTCTC 
CTCTGCCTGT 
CTCTCTCTCT 
CTCTCTCTCT 
GTGCCTATCT 
TGTCTGTCTC 
TGTCTGTCTC 
TCTCTGTCTC 
CTGCCTCTCT 
CTCTCTCTCT 
CTCTCTCTCT 
-TCTCTTGCCT 



GACACAGCAC 
GTTTACCTAT 
GAGAAGAGAG 
AGAGAGAGAG 
CTTGACCTGA 
TTGGGCCCGT 
^CTCGCTCTGT 
TTCCCGGGTT 
CGGCTCATCG 
GAACTCCTGA 
CCTGAGCCGC 
CCTTACGCTC 
CATTGCACTG 
TAATAACTAA 
TCCCGTGAGG 
TCATCCCGAG 
CTGAAAATAC 
GGAGGCTGGG 
CGCACCACTG 
CATAAATAAA 
AACAAAGAAA 
CTGGGCTAGG 
TCTGTTTCTC 
TCTTTGTCCC 
TGTTTCTCTC 
TTTCTCTCTG 
TGTCTCTCTG 
TGCCTATCTT. 
CTCCCTTTCT 
TCTCGCTCTC 
TCTCTCTCTC 
TACTCTCCTT 
CTTTCTGTCT 
CTGTCCCTTT 
CTCTGTGCCT 
TGTCTGTCTC 
TCTGTTTCTC ' 
TCTCTGTCTT 
CTCCTTGTCT 
TGTCTCTCTG 
TCTCTCTCCC 
TGTCTGTCTC 
TGCCTCTCTC 
CTCTCCTTTA 
GTCTCTGTCT 
CTCTCTCTCT 
TGTGTGTCTG 
TCTCTCTCTC 
TGTCTCTCTC 
TCTGTCTGTC 
CCGTCTGTCT 
TTCTGTCTGT 
CTTTCTTTTT 
TCTGTCTTAC 
CGTCCCTCTC 
TTTCTCTGTC 
TGTCTCTCTT 
CTCTCTCTCT 
CTCTCTCTCT 
CTCTCTCTGC 
GCCTGTCTGT 



TGACTACCCG 
GTCACAGTCT 
GAGAGAGAGA 
AGAGAGAGAG 
GTCAGGGGGT 
T^TTTl'TTTC. 
CACCCAGGCT 
CCAGTGATTC 
TTCTATTTTT 
CCACAAATGA 
CGGGATTTCA 
AGAATGACGT 
TAGCCTGGGC 
CTAACTAACT 
AGTGATTTCT 
GtCAGGAGTT 
GAAATGGAGT 
GTGGAAGAAT 
CACTACAGCC 
TACACACATA 
AAAAGAAAAG 
GCCTTCTCTC 
TGTCTCTCTG 
TCTCTCTCCC 
CCCGTCTGTC 
TTTCTCTCTG 
CCCCTCTCTT - 
CTGTCTTACT 
.GCTTCTCTCT, 
TTTCTCTCTG . 
TCTGTCTCTG 'l 
CTCTGCCTGT ' 
CTTTCTCTGT 
CTGTGTCTGT 
ATCTTCTGTC 
CCTGCCTTTC 
TCTGTTTCTC 
TCTTTCTGTC 
CTCTCACTGT 
TCTCTCTCTC 
TGTTTCTCTC 
TTTCTCTCTG 
ACTGTGTCTG 
CTGTCTCTTT 
TTCTCTGTCT 
CTCTGTTCCT 
TCTCTCTCTC 
. TTTCTGGGTG 
TTTCTCTCTG 
TCTCTCACTG 
GTCTGTCTCT 
TTCTCTCTTT 
GTCTGTCTCT 
TCTGTTTCTC 
TCCCTGTCTG. 
TGTCTGTCTC 
TCTGTGC'CTA 
GCCTGTCTCC 
CCATCTCTGT. 
CTCTCTCTCT. 
CTGTCTGTCTr* ~ 



23280 
23340 
23400 
- 23460 
23520 
- -23580 
23640 
23700 
23760 
23820 
23880 
23940 
24000 
24060 
24120 
24X80 
24240 
24300 
24360 
24420 
24480 
24540 
24600 
24660 
24720 
. 24780 
24840 
24900 
24960 
25020 
25080 
25140 
25200 
25260 
25320 
25380 
25440 
25500 
25560 
25620 
25680 
25740 
25800 
25860 
25920 
25980 
26040 
26100 
26160 
26220 
26280 
26340 
26400 
26460 
26520 
26580 
26640 
26700 
26760 
26820 
26880 
CTCTCCCTCC ATGTCTCTCT CTCTCTCTCA CTCACTCTCT CTCCGTCTCT CTCTCTTTCT 26940
GTCTGTTTCT CTCTCTGTCT CTCTCTCTCC CTCCATGTCT CTCTCTCTCT CTCTCACTCA 27000
CTCTCTCTCC CTCTCTCTCT CTCTTTCTGT CTGTTTCTCT CTCTGTCTCT CTCTCTCCCT 27060
CCATGTCTCT CTCTCTCCCT CTCACTCACT CTCTCTCCCT CTCTCTCTCT CTTTCTGTCT 27120
GTTTCTTTGT CTCTCTCTCT CTCTGTCTCT CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT 27180
CTCTCTGTTT GTCTTTCTCC CTCCCTGTCT CTCTGTCTCT CTCTCTCTCT CTCTCTCTCT 27240
CTCTGTCTCT CTCTCTTTCT CTTTCTGTCT GTTTCTCTCT ATCTCTCGCT GTCCATCTCT 27300
GTCTTTCTAT CTCTCTCTCT TTCTCTGTCA CTCTGTCAGA CACACCCCTG CCGGTAGGGC 27360
CCTGCCCTTC CACGACACTC AGAACCCCGT CCTTCGGTGC TTAGAGACGC CGAGAGGAAT 27420
CTAGACAGGC CGGCCTTGCT GGGCTTCCCC ACTCGCTGTA CCATTTCCGG ACGTCGAGGC 27480
CGGGTCCCCG CTTCGATGCG AGGGGCATTT TCAGACTTTT CTCTCGCTCA CGTGTGGCGT 27540
CCGTACTTCT CCTATTTCCC CCATAACTCT GCTCGACTTC AACATAAACT GTTAAGGCCG 27600
GACGCCAACA CGGCGAAACC CCGTCTCTAC TAAAAATACA AAGCTGAGTC GGGACCCGTG 27660
GGGCAGGCCC TGTAATGCCA GCTCCTCGGG AGGCTGAGGC CGGACAATCG CTTGAACCAG 27720
GGAAGCGGAG GCTGCAGGGA CCCGAGATCC CGCCACTGCA CTACCCCCCA GGCTGTACAG 27780
TGAGTGAGAC TCGGTCTCTA AATAAATACG GAAATTAATT AATTCATTAA TTCTTTTCCC 27840
TGCTGACGGA CATTTGCAGG CAGGCATCGG TTGTCTTCGG GGATCACCTA GCCGCCACTG 27900
TTATTGAAAG TCGACGTTGA CACGGAGGGA GGTCTCGCCG ACTTCACCGA GCCTGCGGCA 27960
ACGGGTTTCT CTCTCTCCCT TCTGGAGGCC CCTCCCTCTC TCCCTCGTTG CCTAGGGAAC 28020
CTCGCCTAGG GAACCTCGGC CCTGGGGGCC CTATTGTTCT TTGATCGGCG CTTTACTTTT 28080
CTTTGTGTTT TGGCGCCTAG ACTCTTCTAC TTGGGCTTTG CCAACGGTCA GTTTAATTTT 28140
CAAGTTGCCC CCCGGCTCCC CCCACTACCC ACGTCCCTTC ACCTTAATTT ACTGAGNCGG 28200
TTAGGTGGGT TTCCCCCAAA CCCCCCCCCC CCCCCCGCCT CCCAACACCC TCCTTGCAAA 28260
CCTTCCAGAG CCACCCCGGT GTCCCTCCGT CTTCTCTCCC CTTCCCCCAC CCCTTCCCGG 28320
CGATCTCATT CTTGCCAGGC TCACATTTCC ATCGGTGGGC CTCACCCCTC ACTCCCCCGC 28380
CACCGTTTTT GAAGATGGGG CCGGCACCGT CCCACTTCCC CGCACGCACC TTCCCCCCAT 28440
GGCATAGCCC CTTGACCCGC GTGGGCAAGC GGGCGGGTCT CCACTTGTGA GCCTTTTGCC 28500
CCCGCTGCTT CCCGCTCAGG CCTCCCTCCC TAGGAAAGCT TCACCCTGGC TGGGTCTCGG 28560
TCACCTTTTA TCACGATGTT TTAGTTTCTC CGCCCTCCGG CCACCAGACT TTCACAATGC 28620
GAAGGGCGCC ACGGCTCTAG TCTCGGCCTT CTCAGTACTT CCCCAAAATA GAAACGCTTT 28680
CTGAAAACTA ATAACTTTNC TCACTTAAGA TTTCCAGGGA CGGCGCCTTG GCCCGTGTTT 28740
GTTGGCTTGT TTTGTTTCGT TCTGTTTTGT TTTGTTCGTG TTTTTCCTTT CTCGTATGTC 28800
TTTCTTTTCA GGTGAAGTAG AAATCCCCAG TTTTCAGGAA CACCTCTATT TTCCCCAAGA 28860
CACGTTAGCT GCCGTTTTTT CCTGTTGTGA ACTACCGCTT TTCTGACTCT CTCAACGCTG 28920
CAGTGAGAGC CGGTTGATCT TTACNATCCT TCATCATCAC ATCTTATTTT CTAGAAATCC 28980
GTAGGCGAAT GCTGCTGCTG CTCTTCTTCC TGTTCTTGTT CTTGTTGTTG TCGTCGTTGC 29040
TGTTGTCGTT GTCGTTGTTG TTGTCCTTGT CGTTGTTTTC AAAGTATACC CCCCCCACCC 29100
TTTATGGGAT CAAAAGCATT ATAAAATATG TGTGATTATT TCTTGAGCAC CCCCTTCCTC 29160
CCCCTCTCTC TCTCTCTCTC TCTCTCTCTC TCTCTGTCTT TCTCTCTCTC TCTTCTCTCT 29220
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCC CTCTCTGTTT CTCTCTCTCT GCCTCTCTCT 29280
CTCTGTCTCT CTCTGCCTGT CTCTCTCACT CTCTCTCTCT TCTGTCTTAC TCCCTTTCTC 29340
TGTCTGTCTG TCCGTCTCTC TCTCTCTCTC TCCCTGTCTG TATGTTTCTC TCTCTCTCTC 29400
TCTCTCTCTC TCTTTCTGTT TCTCTCTCTC CGTCTCTGTC TTTCTCTGAC TCTCTCTCTC 29460
TTTCCTTCTC TCTCTCTCTC TCTGCCTGTC TCTCTCACTC TGTCTTCTGT CTTATCTCTC 29520
TCTCTGCCTG CCTCTCTCTC TCACTCTCTC TCTCTCTGTC TCTCTCTCTC TCTTTCTGTT 29580
TCTCTCTGTC TCTCTGTCCG TCTCTGTCTT TCTCTGTCTG TCTCTTTGTC TCTCTGTCTT 29640
TGTCTTTCCT TCTCTCTGTC TCTCTCTCTC TCACTGTGTC TGTCTTCTGT CTTAGTCTCT 29700
CTCTCTCTCT CTCCCTGTCT CTCTGTCTCT CTCTCTCTCT CCCCCTCTCT GTTTCTCTCT 29760
CTCTCTCTCT CTCTCTCTCT CTCTGTCTTT GTCTTTCTTT CTCTCTCTCT CTCTCTCTCT 29820
CTCTCTGTGT GTCTCTCTTC TGTCTTACTG TCTTTCTCTG CCTCTCTGTC TCTCTGTCTG 29880
TCTCTGTCTG TCTCTCTCTC TCTCTCCCCC TGTCGGCTGT TTCTCTGTCT CTCTCTCTCT 29940
CTCTCTTTCT GTCTGTTTCT CTCTGTCTCT CTTTCTCTCT CTGTCTCTTT CTCTCTCTCT 30000
CTCTGTCTCT CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CTCCCCCTCT CTCTGTCTCT 30060
GTGTATGTGT CTCTCTGTGT CTCTCTGTGT CTGCCTTCTC TCTTACTCTC TTTCTCTGCC 30120
TGTCTGTCTG CCTCTCTCTT TCTCTCTCTC TCTCTGCCTG TCTCTCTCCC TTCCTCTCTG 30180
TTTCTCTCTC TTTCTCTTTC TCTCTCTCTC TGTCCATCTC TCTCTTTCTC CCTCTCTCTC 30240
TTTATCTGTC TCTCTCCGTC TCTCTCTTTA TCTCTCTCTC TCTCTCTTTC TCTCTTTCTC 30300
TCTCTGTGTA TCGTTCTCTC TCTCTCTCTC TCTCTCTCTC TGTCTGTCTG TCTCTCTCTC 30360
TCTCTCTCTC TCTCTCTCTC TCTCTCCGTC TCTCTGTCTG GGTCTCTGCG TCTCGCTATC 30420
TCCCGCCCTC TCTTTTTTTG CAAAAGAAGC TCAAGTACAT CTAATCTAAT CCCTTACCAA 30480
GGCCTGAATT CTTCACTTCT GACATCCCAC ATTTGATCTC CCTACAGAAT GCTGTACAGA 30540



WO 97/40183



PCT/US97/05911 



-220- 



ACTGGCGAGT 
■ \ AATCCTAAAA 
CCTAGGATGC 
CCGAGCCCTG 
,.. v . ACTACCCAGG' 

ggctttttgg 

■ catgcttgct 

. tgtcccaccg 
cgtcaccaac 
• gactcttggg 
aaggtcccac 
acgtcccgac 
acgtcccggg 
caccccacca' 
gagcctgacg 
ggggttgtgg 

- C3AATATGGCT 
TCAAACCCTC 
GTGCTGGATG 
TGGCTTTGTG 
GCAGTGGCGT 
; TCTCAGCGCC 
TCACCCTCTT 
CCACAGAGAG 
ATTTGAGTGG 
CTGTGCTAAT 
AAAGTTGCTC 
AAGCTGGCCG 
: TTTTTTAAGA 
ACAACACAAG 
ACTGAGTTCT 
' GTTGATTGTT. 
TGTAACTACT 
GTGTTTCAGT 
TGTAATAAGT 
TTGAGAATCA 
TCTf GAGACG 
CTGCAACCGC 
GGACTACAGG 
TTCACCGTGT 
TCCCAAAGTG 
ACTATGAAGT 
GAAGTAGGAC 
ATTTCCTATG 
TAAAGGGTAC 
CAG CTAGTGA 
TTTCCAGTCT 
AGCAAATCAA 
CTGCAAAAGT 
ATGAGTTCAC 
AGTTATTGAG 
GCTATCTAAC 
TTCATTAGCA 
GAAGCTAATG 
CATAAGCATC 
GTTCCCGTTT 
.^?TTTATCCA 
GCCAAAGTGT 
CCTCTGAGAA 
ATTTGGCACA 
ATAGCTTCTT 



TGATTTCTGG ACTTGGATAC 
TCTGGGGTGG CTTCTCCCTC 
CGGAAGAGTT TTCTCAATGT 

rnnn^^^'^^'^ CTCAAATATG 
GCCCCTTGTG GAACCACTGG 
CTAGGAGGCC TAAGCCTGCT 
AGCGGTGGAT- GAGTCTCTGG 
AGGTCAAATG GATACCTCTG 
CGTCACCGTC AGCATCCTTG 
AGCCCGGCCT TCGTCGGCTA 
TGAACGGCGA AGATGTGGAG 
AGGCGACGAG TTCCCAAGGC 
, CACCCGCGGG ACACCGCCGC 
n"^^^^^^ GCACACACGC 
GAGCGAGAGC CCATTTCACG 
S^IS^^^^^C GAGCCCGATT 
TCTTGGGGGG AGGGGCTTCC 
CCTTGAGGCC ACAAAATAGA 

r^IS^^^^^ agagacctSJ 

Jn^^S^^ TTTCTGAGAT. 
GATCTCAGCT CACTGGAACC 
ACCATGGCCG GCTCATTTTT 
TCATTGGTTT TCACTGGAGA 
AGTTCTTTTT TTTTTTTTTT 
CTTCCTATAT CATTATAATT 
GATAGTGAAA GTGAAGACAA 
^2^'^'^'^^^ AGCTACCTAA 
ATCTQAATAA TCCTCCTTTA 
ATGCGACTCC TGCAAAATAG 
GATCAACCAG ACTTGGGAAA 
I™^^^^ CGGAGAACGT 
ACGTTGGTCA GCAGTAGCTG 
ACAGCAAAAT GAGATATGAT 
AATATAATGC TTCAGATTTA 
CACCCCAAAG ATCACCGTAT 
TACTTTCTTC TTGATATTTA 
CGTCTCGCTC TGTCGCcSS 
CACCTCCCTG GGTTCAAGCG 
r^^SS^^^ CACGGCCAGC 
CGGCCCGGAT GGTCTCGATC 
CTGGGATGAC AGGCGTGAGC 
CAGTCCAGAG AAACGCAATA 
CACACTTTTT CCTATCTTAT 
TGCCTACTTA TACACGAGTA 
f^f^"^ TCATAGTAAG 
ATTGTTTCCA TGTATTTTTC 
CCCAAGCACT TCTTGTGCCC 
SS^^^^ CTAAAGAAAC 
TTGCTAGAAG. ACTGAAACTG 
TTCAGAGTTT GTTCAAGACA 
CAGTAGGTAC CATCCCTAAG 
CAGAAAAATT AGCGAGTACG 
^I^ACCATGC CTTACAATGT 
CTTTGTCCAG TTCTTCAGTG 
ATTTGGATCC ACTTCGAGAG 
GCAGACCGAA ACAGTTTCCC 
-GTCTGTGAAG-TCTTTGGACA 
TGTAGAGTAG ATCTCCATGC 
TTGTCTTTCA. GCTTGCGTGG 
S^^S^^^^ GGTATTGCAG 
TGCCGTGGTA-AGAACACAAA^ 



CTCATAGAAA 
GACTGTCTCG 
GCATCTGCCC 
TACGTGCAAA 
CTCTTTGAAA 
'_GAGAACTTTC 
AAGGACGCAC 
CATTGGCCCG 
TGAGCCTGCC 
AAGTCCAAAG 
CGTAGGTCAG 
TCTGGCCACC 
TTTATCCCCT 
TGGAGGTTCC 
AGGTGGGAGG 
CTCCGTCTTG 
TTAGGCCATC 
TTCCACCCCA 
GCCTGACACC: 
GGAGTCTTGC 
TCTGCCTCCT 
TTTTTTTTTT 
TTCTAGATTC 
'TTTTTAAGCG 
GTGTTATAGA 
AAGAAAGGCT 
ATACGTCAGC 
AACAAACACA 
CTGAACAGAC 
AAATCGAAAA 
AGCTATCGGA 
GCACTATCTT 
CCATTAAACA 
GAAGCAAATC 
CTGACAAAAT 
CTTATGTATT 
GCTGGAGTGC 
ATTCTCCTGC 
TAATCTTTAT 
TCTTGACCTC 
CACTGAGCCC 
AATGTCAACG 
TCAGTTGATA 
GAAAAGAGTA 
TCCGTAAACT 
TATTATCCAA 
ATCACCACTT 
ACACACACAA 
TTGAGTATAA 
TACGTTTCGT 
TATTTTTCAC 
GGCACCATCC 
CTAGGATTGA 
AAGACAACTC 
TTCTCTGGAA 
TGCAGCACAC 
GAACTGAAAG 
CTTCGACTCT 
ACTCTGAAAG 
TGGTGAGAAG 
GCTAAATAAC' 



CTACATATGA 
AAAAATCGTA 
GTGTCCTAAG 
CACTTCTCTC 
AAAATCCCAG 
-CTGCCGAGGA- 
GGGACTCCGC 
AGGCCTCCGA 
CAAGGCCCCG 
GGATGGTGAC 
AGAGGGGACC 
CCACCCACGC 
CCTCTGTCCA 
AAAACCACAC 
GGTGGGGGTG 
GGTGGCTACA 
ACCGCTTGCG 
CCCATCGACG 
GTCGAATTAA 
TCTGTCCCCC 
GGGTTCAAGT 
TTTTTGGTAG 
GAGCCACACC 
CAACGCAACA 
TGAAGAAACG 
ATCTATTTTG 
ATTTACACTC 
ATTTTTGATA 
GATACACATT 
CCACACAAGT 
AGAGAAGGCA 
TTTGGCCATC 
ACATATTCGC 
AAATGATAGA 
AACTACCACA 
TATTTTTTTT 
GATGGTGTGA 
CTCAGCCTCC 
ACTTTTAATA 
GTGACCCGCC 
GGCCTTCTCT 
GTGAGGATGG 
ACAATATGAC 
AAACAGAGAG 
GGAACACTGT 
TAAGTGAACT 
CGGTGCTCGA 
ACCAAAGACA 
GGATCTGGTA 
AAGGAAACAT 
CAAATCCGTG 
ATAGGGCTTT 
CCCTGATAGC 
ACGCCCTAAT 
GAATTGAATC 
-CAGGCCTCTG- 
AGCAACCTCT 
GTAATTCTCA 
TTTACAATAG 
C TAG ATGGGT 
CTTTCCCCCT 



ataaagatcc 
cctctgttcc 
tgatctgtga 
catttccaca' 

AAGTGGTTTT 
TCCTCGGGAC" 
AAAGCTGACC 
AGTACATCAC 
CCTCCGGGGA 
TTCCACCCAC 
AGGAGGGGAG 
CCCACGCCCC 
CAGCCGGCCC 
GGTGTGACTA . 
GGGTGGGTTG 
GGCTAGAAAT 
GGACTACCTC 
TTTCCCCCGG 
ACACCTTGAC 
AGGCTGGAGT 
GATTCTCCTG 
ACACGGGGTT 
TCATTCCGTG 
TGTCTGCCTT 
GTATTAAACA 
TGGTTAGAAT 
TTGCTAGTAA 
GGGTTAAGAT 
TAAAAAAATA 
CTTATGAAGA 
GTATTGGCAA 
TTTCGGGCAA 
AAATCAAAAA 
ACTCCACTGG 
GGGTTATGAC 
AATTTATTTC 
TGTGGGCTCA 
CGAGTAGCTG 
GAGACGGGGT 
CGCCTCGGCC 
TGACGTTTAA 
TGTTGAGGCA 
CTAGGTAGTA 
ACTGCTAAAT 
CAAAAAGCAG 
ATGCTATTCC 
AGAAAAAGTA 
ACTACAGCGT 
TTCTACGATC 
CTTAGTTAGA 
ACAATAAAGA 
G TCT TTACGC 
ATTTCGAAAA 
GCGCTATAGG 
GCAATATCGT 
GGTGGCGAAT- 
TTCGGAGGAT 
ATCCTCCTAA 
GCCNTTTCCG 
CAAGATGGTG-- 
TTCACGAAGA 



30600 
30660 
30720 
30780 
^30840^ 
30900 
30960 
31020 
31080 
31140 
31200 
31260 
31320 
31380 
31440 
31500 
31560 
31620 
31680 
31740 
31800 
31860 
31920 
31980 
32040 
32100 
32160 
32220 . 
32280 
32340 
32400 
32460 
32520 
32580 
32640 
32700 
32760 
32820 
32830 
32940 
33000 
33060 
33120 
33180 
33240 
33300 
33360 
33420 
33480. 
33540 
33600 
33660 
33720 
33780 
33840 
33900 " " 
33960 
34020 
34080 

3414 0 

34200 
AGGCTCATCA AGCCTTCCGC TGCTGCTTTT TGTAGATTAA AAGCCTGAAT CTGAGGCGCG 34260
ATTGCGGCTA TTTTCCCTTC TGAAATGACG GAAGAGTCCA ATTTTGTCAC TTCCAGGCTA 34320
TCACTTATGT TCGGTGGAGT TATTGCTCCT TTATTAGTTT TACTTTTGGT TCTTCTGTTT 34380
GGGATTTTAG GTGGAAACTT CATTTTTAAT TTTCTCCTAA TTCTCCTCGG TTGTGGAGCT 34440
GTCACTAGTC AAGAGTCGTG AATTTCTTCG AGGNCGGTGC ATTTGGGGGA GATGCCATAG 34500 
TGGGGCTCAA TACCTGAGGT GTTGCCCTTG TCGGCGGACC AGAAGTTTGT GTTTTTGCAA 34560 
GGACTGGAGT TACCTTTCGG CTCTTTCCCC TCTGCGAGAA GACAGACGGT GTTCCGGTTT 34620 
GGCGGATTCT GGCAACAGGC TTTTCTGAAG GGGCTCCGGT GGATGGCACG TCAGTGACAG 34680 
ACGGTGTCTC ATACCAGTGC AGTTTTGTCA ATAGGGTCCG TCTCCGGGAC TTGGGGTTTC 34740
TAATGGCAAA ATGCCAACAC TTGGGGTTAA TGGACTAACA GCTGCTGGTC CTCCTAATAA 34800
ACTTCGACCA GTTTTTGGTT TATGTTGAAC CTGTTTAGAT CATATGGAAG TTCCTGTTCC 34860
CAGTGGGACA GTATCAGGTG AAAGGACAGC TGAATCGATA GAAGACACTG GGGAGTCTGT 34920
ATTCAAGGAG TACTTTGAAT TGGAAGATTC TAAATTCCAT CCGTTTCATT CGACGGTGTC 34980
CTGGGGTGTT TCCGTAAGAA CGGTCTCGGG CTGTCTGTGA CATAAACTAG GACGAGGTCC 35040
AAGTGTTGTG GCGCAACACT TGGACAGGCA GTTGCTAAAG CTCTCTAGAG AGGTGAATGA 35100
AAATGTTTGG TCAGGATCTG GCTTTTCCCC CCTATTTCAC ATCATGATTC AAAGGGACAC 35160 
CAGAGGAAAG GATTTCAACG AAGGCTCTTT TGGTCACATT CTGATCCTTT GGTAAGCCGA 35220 
TCTGTCTTGC AATATACATG TCCCGACGAT GGAAGGGGAA AGCGAGCTGA ATCACCAAAC 35280 
TCAGGAACGA TAATATCATC GTGGCTTTTC TGCTTATGAA ACACTCCACC CGATAAGATT 35340 
TGATCCCCTT CTGCAAGCTT GCTGAGATCA ACACAACATT TCGCAAGCAG GCATTTGCAT 35400 
TGCGGGGTAG TACAACTGTG TCCTTTCAAG AGTCTATATG TTTTATAGGC CTTTCCTGAG 35460 
CGGTAAGAAC AGGTCGCCAG TAAGAACAAG GCTTCTTCTG AGTGTACTTC TGCATAAAGG 35520 
CGTTCTGCGG GGGAAACCGC ATCTCGGTAG GCATAGTGGT TTAGTGCTTG CCATATAGCA 35580
GCCTGGACGG GTCCCTGCAG CACCGCCATC CTCGAGGCTC AGGCCCACTT TCTGCAGTGC 35640
CACAGGCACC CCCCCCCCCC CATAGCGGCT CCGGCCCGGC CAGCCCCGGC TCATTTAAAG 35700
GCACCAGCCG CCGTTACCGG GGGATGGGGG AGTCCGAGAC AGAATGACTT CTTTATCGTG 35760
CTGACTCTGG AAAGCCCGGC GCCTTGTGAT CCATTGCAAA CCGAGAGTCA CCTCGTGTTT 35820
AGAACACGGA TCCACTCCCA AGTTCAGTGG GGGGATGTGA GGGGTGTGGC AGGTAGGACG 35880
AAGGACTCTC TTCCTTCTGA TTCGGTCTGC ACAGTGGGGC CTAGGGCTGG AGCTCTCTCC 35940
GTGCGGACCG CTGACTCCCT CTACCTTGGG TTCCCTCGGC CCCACCCTGG AACGCCGGGC 36000
CTTGGCAGAT TCTGGCCCTT TCTGGCCCTT CAGTCGCTGT CAGAAACCCC ATCTCATGCT 36060
CGGATGCCCC GAGTGACTGT GGCTCGCACC TCTCCGGAAA CATTGGAAAT CTCTCCTCTA 36120
CGCGCGGCCA CCTGAAACCA CAGGAGCTCG GGACACACGT GCTTTCGGGA GAGAATGCTG 36180
AGAGTCTCTC GCCGACTCTC TCTTGACTTG AGTTCTTCGT GGGTGCGTGG TTAAGACGTA 36240
GTGAGACCAG ATGTATTAAC TCAGGCCGGG TGCTGGTGGC TCACGCCTGT AACCCCAACA 36300
CTTTGGGAGG CCGAGGCCGT AGGATCCCTC GAGGAATCGC CTAACCCTGG GGAGGTTGAG 36360
GTTGCAGTGA GTGAGCCATA GTTGTGTCAC TGTGCTCCAG TCTGGGCGAA AGACAGAATG 36420
AGGCCCTGCC ACAGGCAGGC AGGCAGGCAG GCAGGCAGAA AGACAACAGC TGTATTATGT 36480
TCTTCTCAGG GTAGGAAGCA AAAATAACAG AATACAGCAC TTAATTAATT TTTTTTTTTT 36540
CCTTCGGACG GAGTTTCACT CTTGGTGCCC ACGCTGGAGT GCAGTGGCAC CATCTCGGCT 36600
CACCGCAACC TCCACCTCCC GCGTTCAAGC GATTCTGCTG CCTCAGCCTC CTGAGTAGCT 36660
GGGATTACAG GGAGGAGCCA CCACACCCAG CTGATTTTGT ATTGTTAGTA GAGACGGCAT 36720 
TTCTCCATGT GGGTCAGGCT GGTCTCGAAC TGGCGACCCC AGTGGATCTG CCCGCCCCGG 36780 
CCTCCCAAAG TGCTGGGGTG ACAGGCGTGA GCCATCGTGA CTGGCCGGCT ACGTTTATTT 36840 
ATTTATTTTT TTAATTATTT TACTTTTTTT TAGTTTTCCA TTTTAATCTA TTTATTTATT 36900 
TACATTTATT TATTTATTTA TTTATTTACT TATTTATTTA TTTTCGAGAC AGACTCTCGC 36960
TCTGCTGCCC AGGCTGGAGT GCAGCGGCGT GATCTCGGCT CACTGCAACG TCCGCCTCCC 37020
GGGTTCACGC CATTCTCCTG CCTCAGCCTC CCAAGTAGCT GGGACTACAG GCGCCCGCCA 37080
CCGTGCCCGG CTAACTTTTT GTATTTTGAG TAGAGATGGG GTTTCACTGT GGTAGCCAGG 37140
ATGGTCTCGA TGTCCTGACC CCGTGATCCG TCCACCTCGG CCTCCCAAAG TGCTGGGATG 37200 
ACAGGCGTGA GCCACCGGCC GCGGCCTATT TATCTATTTA TTAACTTTGA GTCCAGGTTA 37260 
TGAAACGAGT TAGTTTTTGT AATTTTTTTT TTTTTTTTTT TTTTTTGAGA CGAGGTTTCA 37320
CCGTGTTGCC AAGGCTTGGA CCGAGGGATC CACCGGCCCT CGGCCTCCCA AAAGTGCGGG 37380 
GATGACAGGC GCGAGCCTAC CGCGCCCGGA CCCCCCCTTT CCCCTTCCCG CGCTTGTCTT 37440 
CCCGACAGAC AGTTTCACGG CAGAGCGTTT GGCTGGCGTG CTTAAACTCA TTCTAAATAG 37500
AAATTTGGGA CGTCAGCTTC TGGCCTCACG GACTCTGAGC CGAGGAGTCC CCTGGTCTGT 37560 
CTATCAGAGG ACCGTACACG TAAGGAGGAG AAAAATCGTA ACGTTCAAAG TCAGTCATTT 37620 
TGTGATACAG AAATAGACGG ATTCACCCAA AACACAGAAA CCAGTCTTTT AGAAATGGCC 37680 
TTAGCCCTGG TGTCCGTGCC AGTGATTCTT TTCGGTTTGG ACCTTGACTG AGAGGATTCC 37740 
CAGTCGGTCT CTCGTCTCTG GACGGAAGTT CCAGATGATC CGATGGGTGG GGGACTTAGG 37800
CTGCGTCCCC CCAGGAGCCC TGGTCGATTA GTTGTGGGGA TCGCCTTGGA GGGCGCGGTG 37860 
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* ' ACCCACTGTG 

TTCATTCCGG 

• CTCTGAAAAC 
GCAACTGTGT 
CTAGGAAATC 
AAACAGATAA 

_ _ ^^ACCCATTACA 
. ATACAATAGG 
ATAC7VATACA 
: GATGCCGAGG 
: AGAAATCCCG 
TCCCAGCTGC 
AGTGAGCCGA 
AAAAATAAAT 
ATAAATTAAA 
CCTGTCATCC 
AGGGCCAGTA 
GCTGTGCTGT 
GCTTGAACCT 
GGGCGACAGA 
AATTT^AAAAG 
AGAACAACCC 
AGGAATTATG 

tcgagacgga: 
ccctggctgg 
ctttaacccg 
ttgttgttgt 
ttgcc tggcc 

TTTTTCTTCT 
AQAGGGCAAT 
TCC TGCCTCA 
TTTGTACTTT 
ACCTCAGGTG 
CGCCCAGCCT 
TTCTTGCTTT 
TTG CTTGCTT 
TTT CTTTCTT 
CTTGCTTTCC 
GCTTGCTTGC 
T GCT TGCTTT 
TCTTTCTTTT 
TCTCGATTTC 
TGC TTTCTTG 
TTTC TTGCTT 
GTTTCTTTCT 
CGTGCTTTCT 
CTTTTCTTTC 
TTTCTTTCTT 
TTCACTCTTG 
CCTCCCGGGT 
AGGCACCCCC 
CCATGTTGCT 
TCGAAGTGCT 
TTTATTTCTT 
TTATATGCAA 
CGTATCGGTT 
ATAAATACAC 
AAAAGCGTCG 
TCTTCCTCTC 
TTCTTCCTCT 
TTCTCTTTCG 



CTGTGGGAGC 
GCTGACACGC 
GGAGGCCTCA 
CTTCTCCACC 
GCCACTTTGA 
ATAAATAAAA 
ATACAATAAG 
ATACGATACA 
ATACAATACG 
TGGACGCATC 
TCTCAATTGA 
TAGGAAGGCT 
GATT6CGCCA 
ACATAAATAA 
ATAAATAAAT 
CCTCACTTTG 
TGGTGAAACC 
ACTGTCTGTA 
GGGAGGCGGA 
GCGAGACTCC 
TGAGTTTCTG 
CACCGTGACA 
CGTGATTTCT 
GTCTCGGAGG 
GCCCGATTGT 
CGTGGACTCT 
TGGGGACTTT 
TTGCCTGGCC 
TCTTCTTCTT 
GGCGCGATCT 
GCCTCCTGAT 
TAGTAGAGAC 
GTCCGCCTGC 
CTCTCTCTCT 
CCCGTTTTCT 
GCTTGCTTTC 
TTGTTTCTTT 
TGTTTTCTTT 
TTTCGTGCTT 
C TTG CTTGCT 
GTTTCTTTCT 
TTTCTTTCTT 
CTTTCTTGTT 
TCTTGCTTGC 
TGCTTGCTTT 
TTCTTGCTTT 
ATCATCATCT 
TCT TTCTTTC 
TTTCCACGGC 
TCGAGCGCTT 
ACGCCTGGCT 
CAGGCTGGTC 
GGGATGACGG 
TCGTTTCCAC 
ACAACQACAA 
GTATGGAAAT 
ATGGCTCTAT 
TATTTATGTG 
CTTCGTGTTT 
CTTCCTTTCC 
TTCCCTGTGT 



CTCCATCCTT 
TCACTGGCAG 
CAGAGGAAGG 
GCCCCCGCCC 
CGACCGGGTC 
TAACACAAAA 
ATACGATACG 
ATACAATACA 
CCGGGCGCGG 
ACCTGAAGTC 
AAATACAAAA 
GAGGCAGGAG 
TCGCACTCCA 
ATACATACAT 
AAAATAAAAT 
GGAGGCCAAG 
CCGTCTCTAC 
ATCCCAGCTA 
GGTTGCAGTG 
GTCTCCAAAA 
GGGAAAAAGA 
TACACGTACG 
TTTTTTAACT 
CCCGCCCTCC 
TCTTCTCCTT 
TCCGCCTCGG 
CCTGATTCTC 
TTGCCTTTTC 
CTTTTTTTTG 
CGGCTCACCG 
TAGCTGGGAT 
GGTGTTTTTC 
CTTAGCCTCC 
CTCTCTCTCT 
TGCTTTCTTT 
GTGCTTTCTT 
CTTGCTTGCT 
GTTTCTTTCT 
TCTTGTTTTC 
TGCTTTCGTG 
TGCTTGCTTT 
TTGTTTCTTT 
TTCTTTCTTT 
TTGCTTTCGT 
CTTGCTTGCT 
CTTTTCTTTC 
TTCTTTCTTT 
TTTCTTTCTT 
TAGAGTGCAA 
CTCCTGCCTC 
TGGCTGATGT 
TCCAACTCCC 
GCGTGACGAC 
GCGTTTACTT 
CGTGTATCTC 
AGACTTCTGT 
AAAGAAGGGA 
TGTAAATGAA 
TTCTTCCTTC 
TTCTTTCTCT 
TTCCTTCTTT 



CCCCCCACCC 
GCGTCGGGCA 
GAGCACCAGG 
CCACCTCCAA 
TGATTGACCT 
. GTAACTAACT 
ATAGGATGCG 
ATACAATACA 
TGGCTCATGC 
GGGAGTTGGA 
CTAGCCGGGC 
AATCGCTTGA 
GTCTGAGCAA 
ACATACATAC 
AAATAAATGG 
GCCGGTGGAT 
TCACAATACA 
CTCGGGAGGC 
AGCCGAGATC 
AATGAAAATG 
AGAAAAGAAA 
CTTCTCGCCT 
TCATTTTATG 
CTGGTTGCCC 
GGTCAGGGGT 
GTTTGACAGA 
CCCAGATGTA 
TTTCTTTCTT 
AGACAGAGTT 
CACCCTCCGC 
TACAGGCATG 
CATGTTGGTC 
CAAAGTGCTG 
CTCGCTCGCT 
CTTTCTTTCG 
GCTTTCCTGT 
TTCTTGCTTG 
TTCTTTTCTT 
TCGATTTCTT 
CTTCTTGCTT 
CTTGCTTGCT 
CCTGCTTGCT 
C TTTT GTTTC 
GC TTT CTTGT 
TGTTTTCTTG 
TTTCTTTTCT 
CCTTTCTTTC 
TCTTTCTGTT 
TGGCGCGATC 
CAGCCTCCCG 
TTGTGTTTTT 
GACCTCCTGT 
CGTGGCCGGC 
ATATGTATTA 
TGCATTGAAT 
ATGATAGATG 
TCGTCGATAA 
CCGAGCGTAC 
CTTTCTTCCT 
CTTTCTGTCC 
TTTCTTTCCT 



CCTCCCCAGG 
TCACCTAGCG 
CCGCCTGCGC 
GTTCCTCCCT 
TTGATCAGGC 
AAATAAAATA^ 
ATAGGATACG 
ATACAATACA 
CTGTCATCCC 
GACAAGCCCG 
GCGGTGGCAC 
ACCTGGGAAG 
CAAGAGCGAA 
ATACATACAT 
GCCCTGCGCG 
CAAGAGGCGG 
CAACATTAGC 
CGAGCTGAGG 
GCGCCACTGC 
AAAATGAAAC 
AAAGAAAAAA 
TTCGAGGCCT 
TTATTATCAT 
AGACAACCCC 
TTCCTTGTCT 
TGGCAGCTCC 
GTGAAAGCAG 
TCTTTCTTTA 
TCACTCTTGT 
CTCCCAGGTT 
GGCCACCGTG 
AGGCTGGTCT 
GGATGACAGG 
TGCTTGCTTG 
TTTCTTTCAT 
TTTCTTTCTT 
CTTGCTTGCT 
TCTTTCTTGC 
TCTTTCTTTT 
TCCTGTTTTC 
TGCTTTCGTG 
TTCTTGCTTG 
TTTCTTTCTT 
TTTCTTGCTT 
CTT TCTTGCT 
TTTTCTTTCT 
TTTCTTTCTT 
TCGTCCTTTT 
TTGGCTCACC 
ATTAGCGGGG 
AGTAGGCACG 
GATGCGCCCA. 
CTGTTGACTC 
ATGTAAACGT 
ACTCTTGCGT 
TAGGTGTCTG 
AGACGTTTAT 
GTAGTTATCT 
TTCTCTCCTT 
.TTTTTTCCTT- 
CTCTGTTTCT 



• GGGATCCCAA 
GTCACTGTTA 
AC AGC CTGGG 
- CCCTTGTTGC 
AAAAACGAAC 
'AGTCAATAOaC 
ATAGGATACA 
ATACAATACA 
GTCACTTTGG 
ACCAAGATGG 
ATGCCTATAA 
CGGAGGTTGC 
ACTCCGTCTC 
ACATACATAC 
GTGGCTCAAG 
TCAGACCAAC 
CGGGCGCTGT 
CAGGAGAATC 
AACCCAGCCT 
GCAACAAAAT 
ACAACAAAAC ' 
CAAACACGTT 
GATTGATGTT/ 
GGGAGACAGA 
TTCTTCGTGT 
ACTTTAGGCC 
GTAGATTGCC 
TTACTTTCTC 
TGCCCAGGCT 
CAAGCGATTC 
CTGGCTGATG 
CCCACTCCCA 
CGTG CAACCG 
CTTTCGTGCT 
GCTTGCTTTC 
TCTTTCTTTC 
TTCGTGCTTT 
TTGCTTTCCT 
GTTTCTTTCC 
TTTCTTTCTT 
CTGTCTTGTT . 
ATTGCTTTCG 
GCTTCCTTGT 
TCTTTCTTTT, 
TGCTTGCTTT 
TTCTTGCTTT 
TCTATCTTTC 
GAGACAGAGT 
GCACCTTCCG 
ATTGACAGGG 
CCGTGTCTCT 
CCTCGGCCTC 
ATTTCGCTTT 
TTCTGTACGC 
ATGGTAAATA 
TGTTATACAA ' 
TTTACGTATG 
CTGTTTTCTT 
CTTTAGGTTT 
CGTGCTTTAT; 
TTTTCCCTTC 



37920 
37980 
38040 
38100 
38160 
38220 
38280 
38340 
38400 
38460 
38520 
38580 
38640 
38700 
38760 
38820 
38880 
3 8940 
39000 
39060 
39120 
39180 
39240^ 
39300 
39360 
39420 
39480 
39540 
39600 
39660 
39720 
39780 
39840 
39900 
39960 
40020 
40080 
40140 
40200 
40260 
40320 
4 0380 
- 40440 
40500 
40560 
40620 
40680 
40740 
40800 
40860 
40920 
40980 
41040 
41100 
41160 
41220 
41280 
41340 
41400 
41460" "r 
41520 
TTTCCTTCGT TTCTTTCCTC ATTCTTTCTC TCTTTTTCGT TGTTTCTTTC CTTCCCGTCT 41580
GTCTTTTAAA AAATTGGAGT GTTTCAGAAG TTTACTTTGT GTATCTACGT TTTCTAAATT 41640
GTCTCTCTTT TCTCCATTTT CTTCCTCCCT CCCTCCCTCC CTCCCTGCTC CCTTCCCTCC 41700
CTCCTTCCCT TTCGCCATCT GTCTCTTTTC CCCACTCCCC TCCCCCCGTC TGTCTCTGCG 41760
TGGATTCCGG AAGAGCCTAC CGATTCTGCC TCTCCGTGTG TCTGCAGCGA CCCCGCGACC 41820
GAGTCCTTGT GTGTTCTTTC TCCCTCCCTC CCTCCCTCCC TCCCTCCCTC CCTCCCTGCT 41880
TCCGAGAGGC ATCTCCAGAG ACCGCGCCGT GGGTTGTCTT CTGACTCTGT CGCGGTCGAG 41940
GCAGAGACGC GTTTTGGGCA CCGTTTGTGT GGGGTTGGGG CAGAGGGGCT GCGTTTTCGG 42000
CCTCGGGAAG AGCTTCTCGA CTCACGGTTT CGCTTTCGCG GTCCACGGGC CGCCCTGCCA 42060
GCCGGATCTG TCTCGCTGAC GTCCGCGGCG GTTGTCGGGC TCCATCTGGC GGCCGCTTTG 42120
AGATCGTGCT CTCGGCTTCC GGAGCTGCGG TGGCAGCTGC CGAGGGAGGG GACCGTCCCC 42180
GCTGTGAGCT AGGCAGAGCT CCGGAAAGCC CGCGGTCGTC AGCCCGGCTG GCCCGGTGGC 42240
GCCAGAGCTG TGGCCGGTCG CTTGTGAGTC ACAGCTCTGG CGTGCAGGTT TATGTGGGGG 42300
AGAGGCTGTC GCTGCGCTTC TGGGCCCGCG GCGGGCGTGG GGCTGCCCGG GCCGGTCGAC 42360
CAGCGCGCCG TAGCTCCCGA GGCCCGAGCC GCGACCCGGC GGACCCGCCG CGCGTGGCGG 42420
AGGCTGGGGA CGCCCTTCCC GGCCCGGTCG CGGTCCGCTC ATCCTGGCCG TCTGAGGCGG 42480
CGGCCGAATT CGTTTCCGAG ATCCCCGTGG GGAGCCGGGG ACCGTCCCGC CCCCGTCCCC 42540
CGGGTGCCGG GGAGCGGTCC CCGGGCCGGG CCGCGGTCCC TCTGCCGCGA TCCTTTCTGG 42600
CGAGTCCCCG TGGCGAGTCG GAGAGCGCTC CCTGAGCCGG TGCGGCCCGA GAGGTCGCGC 42660
TGGCCGGCCT TCGGTCCCTC GTGTGTCCCG GTCGTAGGAG GGGCCGGCCG AAAATGCTTC 42720
CGGCTCCCGC TCTGGAGACA CGGGCCGGCC CCTGCGTGTG GCCAGGGCGG CCGGGAGGGC 42780
TCCCCGGCCC GGCGCTGTCC CCGCGTGTGT CCTTGGGTTG ACCAGAGGGA CCCCGGGCGC 42840
TCCGTGTGTG GCTGCGATGG TGGCGTTTTT GGGGACAGGT GTCCGTGTCC GTGTCGCGCG 42900
TCGCCTGGGC CGGCGGCGTG GTCGGTGACG CGACCTCCGG GCCCCGGGGG AGGTATATCT 42960
TTCGCTCCGA GTCGGCAATT TTGGGCCGCC GGGTTATAT 42999



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 175 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 


(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

CTCCCGCGCG GCCCCCGTGT TCGCCGTTCC CGTGGCGCGG ACAATGCGGT TGTGCGTCCA 60
CGTGTGCGTG TCCGTGCAGT GCCGTTGTGG AGTGCCTCGC TCTCCTCCTC CTCCCCGGCA 120
GCGTTCCCAC GGTTGGGGAC CACCGGTGAC CTCGCCCTCT TCGGGCCTGG ATCCG 175

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 755 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:






(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:



GCGGCGTTGG TAGTCTCcS SSJ?^ JJ^JSJ??^ GCTCGAGGGT GGCGGTGG^g: 12S 
, TTCGGGGCCG GCGTTGCTTG GCTTACgS? r^S52SI^ TTGGGGGGGG TGCCGTCGTT 180 

- -gggggtgtga-ttcccgccgS -t??S§c?^" r^^J^-^^°<=cTc -aggagtcgS - ^ ilo- 

GTTCGTGTCT CGGGAGCGGT GCTlT-rm.^^ HSIS^^*^^^ CTTTGCCTCG GGTTTGCTTG Itr^ 

. cgggggacgt tc??ct?§?5 ??^cgcc S^SSSI GGaS????? 11° 

TCCCCTTCCC CGTTTCGCCG TcSS^o^rn ??I«=<3TTTT CGTTTCGGGC TGTGTTCGTT lln 

GCCCGGCCGT GcJSeSSa? ^J^SSS SSS^f GGCCCTCTCC CCGGtcSS tlo 

GGCGGCCACT GTGGTCCGGG AGC^t^SA CCCGGGCACG CACGCGTCCG tin 

TGCCCCCGCG GGC???§|?S ^Sn^IS AGGgScctS loo 

GAAGGCTGCG CACGTTGtCG GT?™?!? II^aIT™ GGGGGGGCCT GTGCGTOCGG teo 
GTCCTTCGTC GTCCCGTcS ?SS5rc ■ 7^0 ■ 
■ . ; ■ . '■ 755. 
(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 463 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

TTCTGGTTGT TGGCGGCGGG GGCTCCal^ TGAGATAACC CCGAGCGTGT 120 

CAGGTCAGCC TCCGCCTGTG gSSJJctJS ^rnl^^"" CCTCCCCCTC TCCCCGAGGC ■ isS 
AGCGAGCCCG TCCGTTCGAC CTtSS^^G cn^rnl^^'X CCCCCCTCAC GTCCCTCGCG 240 • 
CCCCGGGGTT TTCACGGCGC CC??SJSS rr^^S^SS^^ ATCTTTCCGC GCTCCGTTGG 300 

CTGGTTCCGG TCTCCCCGCC aSJSc^ SSE^S.^^S^^ TCCGCCCGTG GTTTGGACGC 3S0 

CGGGTCTCCC AACCCCCgS? ^SSgSSS JSgSSS?? SS^^^''^'^^ GCTTGCTCTT 420 

(2) INFORMATION FOR SEQ ID NO:21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 378 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

SSSS S?S???3g ?SS??SS TOAOTOCT TTCOMCTCC COCCC«=a<« „ , 

CSGCGACmc CCaTTCOAAC JS?™^"^ 120 

cTooxa^c -.^cxS^^ssss: psssi? jsss ; • is;:; 
GGCTACCACA TCCAAGGAAG GCAGCAGGCG CGCAAATTAC CCACTCCCGA CCCGGGGAGG 300
TAGTGACGAA AAATAACAAT AGAGGACTCT TTCGAGGGCC TGTAATTGGA ATGAGTCCAC 360
TTTAAATCCT TTAAGCAG 378

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 378 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 



GATCCATTGG AGGGCAAGTC TGGTGCCAGC AGCCGCGGTA ATTCCAGCTC CAATAGCGTA 60
TATTAAAGTT GCTGCAGTTA AAAAGCTCGT AGTTGGATCT TGGGAGCGGG CGGGCGGTCC 120
GCCGCGAGGC GAGTCACCGC CCGTCCCCGC CCCTTGCGTC TCGGCGCCCC CTCGATGCTC 180
TTAGCTGAGT TGTCCCGCGG GGCCCGAAGC GTTTAGTTTG AAAAAATTAG AGTTGTTTCA 240
AAGCAGGCCC GAGCCGCCTG GATACCGCCA GCTAGGAAAT AATGGAATAG GACCGCGGTT 300
CCTATTTTGT TTGGTTTTCG GAACTGAGCC CATGATTAAG GGAAACGGCC GGGGGCATTC 360
CCTTATTGCG CCCCCCTA 378



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 719 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:



GGATCTTTCC CGCTCCCCGT TCCTCCCGGC CCCTCCACCC GCGCGTCTCC CCCCTTCTTT 60
TCCCCTCTCC GGAGGGGGGG GAGGTGGGGG CGCGTGGGCG GGGTCGGGGG TGGGGTCGGC 120
GGGGGACCGC CCCCGGCCGG CAAAAGGCCG CCGCCGGGCG CACTTCAACC GTAGCGGTGC 180
GCCGCGACCG GCTACGAGAC GGCTGGGAAG GCCCGACGGG GAATGTGGCT CGGGGGGGGC 240
GGCGCGTCTC AGGGCGCGCC GAACCACCTC ACCCCGAGTG TTACAGCCCT CCGGCCGCGC 300
TTTCGCGGAA TCCCGGGGCC GAGGGGAAGC CCGATACCCG TCGCCGCGCT TTTCCCCTCC 360
CCCCGTCCGC CTCCCGGGCG GGCGTGGGGG TGGGGGCCGG GCCGCCCCTC CCACGCCCGT 420
GGTTTCTCTC TCTCCCGGTC TCGGCCGGTT TGGGGGGGGG AGCCCGGTTG GGGGCGGGGC 480
GGACTGTCCT CAGTGCGCCC CGGGCGTCGT CGCGCCGTCG GGCCCGGGGG GTTCTCTCGG 540
TCACGCCGCC CCCGACGAAG CCGAGCGCAC GGGGTCGGCG GCGATGTCGG CTACCCACCC 600
GACCCGTCTT GAAACACGGA GCAAGGAGTC TAACGCGTGC GCGAGTCAGG GGCTCGCACG 660
AAAGCCGCCG TGGCGCAATG AAGGTGAAGG GCCCCGTCCG GGGGCCCGAG GTGGGATCC 719



(2) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 685 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:



S?SSS ^gf?^ TCTCGCCCGC CGCGTCGGGG 

^^2'='=^^ «3AAACTCTG GtSaGOT?? SS^^^^ TGAACTATGC CTGGGCAGGG 
CGACCTGGGT ATAGGGGCGA AAGACTaJt? cla^ofS^S ^GACCTGCAA ATCGGTCGTC 
TTTCCCTCAG GATAGCTGGC GCTCTCrrIf GTAGCTGGTT CCCTCCGAAG 

CGGAATGGAT TAGGAGgIct tSSSS^ CAGTTTTATC CGGGTAAAGG 

ATGGGTAAGG AAGCCCGGCT Ccg^^S^CTG ^^^^5^^ AACTATTTCT CAAACTTTAA 
, GGCCACTTTT GGTAAGCAGA SSIfr^ ^S^^^^^^G TGGAATGCGA GTGCCTAGTG 
^ CCGATGCCGA CGCTCaS^G JJ^^SgS^ CGAACGCCGG GTTaSScJc 

TGGC^TGGA AGTCGGaS? SS^JSJSJS ^Irl^l TGATATAGAC AGcJSSJJS 
GCCCTGAAAA TGGATGGCGC TGGAGGCTc? G^roJ^?^ CTCACCTGCC GAATCAACTA 
AACGGGACGG GACGGGAGCG GCCGC GGCCCATACC CGGCCGTCGC CGGCAGTCGG 

(2) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
GAGGAATTCC CCTATCCCTA ATCCAGATTG GTG 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
AAACTGCAGG CCGAGCCACC TCTCTTCTGT GTTTG



60 
120 
180 . 
240 
300 
360 

420 

480 

540 

600 

660 

685 



33 



35 
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(2) INFORMATION FOR SEQ ID NO:27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: 
AGGAATTCAC AGAAGAGAGG TGGCTCGGCC TGC 33

(2) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
AGCCTGCAGG AAGTCATACC TGGGGAGGTG GCCC 34

(2) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

AAACTGCAGG TTAATTAACC CTAACCCTAA CCCTAACCCT AACCCTAACC CTAACCCTAA 60 
CCCTAACCCT AACCCGGGAT 80 
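
The listed length of SEQ ID NO:29 and its repetitive core can be checked with a short script. The following is only an illustrative sketch: the sequence is transcribed from the listing above, the helper name count_tandem_repeats is hypothetical, and the choice of CCCTAA as the repeat unit is an assumption read off the printed sequence (it is the complementary-strand form of the vertebrate telomeric hexamer TTAGGG).

```python
# SEQ ID NO:29 as printed in the listing (spaces and position numbers removed).
SEQ_ID_29 = (
    "AAACTGCAGG" "TTAATTAACC" "CTAACCCTAA" "CCCTAACCCT"
    "AACCCTAACC" "CTAACCCTAA" "CCCTAACCCT" "AACCCGGGAT"
)


def count_tandem_repeats(seq: str, unit: str) -> int:
    """Return the length, in units, of the longest uninterrupted run of `unit` in `seq`."""
    best = 0
    for start in range(len(seq)):
        n = 0
        # Extend the run one unit at a time while the next unit still matches.
        while seq.startswith(unit, start + n * len(unit)):
            n += 1
        best = max(best, n)
    return best


length = len(SEQ_ID_29)
repeats = count_tandem_repeats(SEQ_ID_29, "CCCTAA")  # assumed repeat unit
print(length, repeats)
```

Under these assumptions the script reports the 80-base length given in the listing and the number of consecutive CCCTAA units between the flanking non-repeat bases.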

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
TTGGGCCCTA GGCTTAAGG 19

(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
GCCAGGGTTT TCCCAGTCAC GACGT 25

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
GCTGCAAGGC GATTAAGTTG GGTAAC 26

(2) INFORMATION FOR SEQ ID NO:33: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
TATGTTGTGT GGAATTGTGA GCGGAT 26

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
GGGTTTAAAC AGATCTCTGC A 21



WHAT IS CLAIMED:

1. A method for producing an artificial chromosome, comprising:

introducing a DNA fragment into a cell, wherein the DNA
fragment comprises a selectable marker;

growing the cell under selective conditions to produce cells
that have incorporated the DNA fragment into their genomic DNA; and

selecting a cell that comprises a satellite artificial
chromosome [SATAC].

2. The method of claim 1, wherein the DNA fragment is
introduced into or adjacent to an amplifiable region of a chromosome in
the cell.

3. The method of claim 2, wherein the amplifiable region 
comprises rDNA. 

4. The method of claim 2, wherein the amplifiable region
comprises heterochromatin.

5. The method of claim 1 or claim 2, wherein the DNA is 
introduced into pericentric heterochromatin in a chromosome of the cell. 

6. The method of claim 1, wherein the cell is a mammalian cell.

7. The method of claim 1 or claim 2, further comprising
isolating the SATAC.

8. The method of any of claims 1-7, wherein the DNA fragment
comprises a sequence of nucleotides that targets the fragment to the
heterochromatic region of a chromosome.

9. The method of claim 8, wherein the targeting sequence of
nucleotides comprises satellite DNA.

10. The method of any of claims 1-9, wherein the cell is a 
human cell. 

11. The method of any of claims 1-5 and 7-9, wherein the cell is
a fish, insect, reptile, amphibian, arachnid, or rodent cell.

12. A SATAC produced by the method of any of claims 1-11.

13. An isolated substantially pure satellite artificial chromosome
[SATAC].

14. The SATAC of claim 13 that is a megachromosome,
comprising about 50 to about 450 megabases [Mb].

15. The SATAC of claim 13, comprising about 250 to about 
400 Mb. 

16. The SATAC of claim 13, comprising about 150 to about 
200 Mb. 

17. The SATAC of claim 13, comprising about 90 to about
120 Mb.

18. The SATAC of claim 13, comprising about 15 to about
60 Mb.

19. A cell containing an artificial chromosome, wherein the
artificial chromosome is produced by the method of any of claims 1-11.

20. A cell containing the SATAC of any of claims 12-19.

21. The cell of claim 19 or claim 20 that is a mammalian cell. 

22. The method of any of claims 1-11, wherein the SATAC is a
megachromosome, and the method further comprises:

introducing a fragmentation vector, whereby the
megachromosomes in the cells are reduced in size,

and identifying cells that contain SATACs that are about 15 to
about 60 Mb.

23. The method of any of claims 1-11, wherein the SATAC is a
megachromosome, and the method further comprises exposing the cells
to conditions, whereby cells that contain truncated megachromosomes
are produced.

24. The method of claim 23, wherein the conditions are selected
from among exposure to X-rays, growth in the presence of an agent that
destabilizes base pairing in the chromosome.

25. The method of claim 24, wherein the agent is
bromodeoxyuridine.

26. The method of any of claims 22-25, further comprising
selecting a cell that comprises a satellite artificial chromosome [SATAC]
that comprises about 15 to about 60 Mb.

27. A cell containing an artificial chromosome, wherein the
artificial chromosome is produced by the method of any of claims 22-25.

28. The cell of any of claims 19-21, 25-27, wherein the artificial
chromosome is a SATAC comprising about 10 to about 60 Mb.

29. An isolated substantially pure satellite artificial chromosome
[SATAC] of claim 13 that comprises about 10 to about 60 Mb.

30. The method of any of claims 1-11 and 22-26, further
comprising isolating the SATAC from the cell.

31. The method of claim 30, wherein isolation is effected by:

isolating metaphase chromosomes;

distinguishing SATACs from endogenous chromosomes; and

separating the SATACs from endogenous chromosomes.

32. The method of claim 31, wherein:

the SATACs are distinguished from endogenous chromosomes by
staining the chromosomes with DNA sequence-specific dyes; and

separation is effected by flow cell sorter.

33. A method for producing an artificial chromosome,
comprising:

introducing a DNA fragment into a cell, wherein the DNA
fragment comprises a selectable marker,

growing the cell under selective conditions to produce cells
that have incorporated the DNA fragment into their genomic DNA,

selecting from among those cells, a cell that comprises a
de novo centromere.

34. The method of claim 33, further comprising isolating that cell
with the chromosome that comprises the de novo centromere, and
growing the cell under conditions whereby a cell with a sausage
chromosome is produced.

35. The method of claim 34, further comprising isolating the cell
with the sausage chromosome; and growing the cell under conditions
whereby a first SATAC is produced.

36. The method of claim 35, wherein the DNA fragment is
introduced into or adjacent to an amplifiable region of a chromosome in
the cell.

37. The method of claim 36, wherein the amplifiable region
comprises rDNA.

38. The method of claim 36, wherein the amplifiable region
comprises heterochromatin.

39. The method of claim 35 or claim 36, wherein the DNA is
introduced into pericentric heterochromatin in a chromosome of the cell.

40. The method of any of claims 33-39, further comprising:
introducing a fragmentation vector that is targeted to the first
SATAC; growing the cells; and selecting a cell that comprises a second
SATAC, wherein the second SATAC is smaller than the first SATAC.

41. The method of claim 40, wherein the selected cell has a
dicentric chromosome comprising the de novo centromere.

42. The method of claim 40, wherein the selected cell has a
formerly dicentric chromosome and a minichromosome comprising the
de novo centromere.

43. The method of claim 40, wherein the selected cell has a
formerly dicentric chromosome.

44. A method for producing an artificial chromosome, comprising:

introducing a DNA fragment into a cell, wherein the DNA
fragment comprises a selectable marker;

growing the cell under selective conditions to produce cells
that have incorporated the DNA fragment into their genomic DNA;

selecting from among those cells a cell that has produced a
dicentric chromosome; and

growing that cell under selective conditions, whereby a cell
that contains a chromosome comprising a heterochromatic arm is
produced.

45. The method of claim 44, further comprising selecting the cell
with the chromosome comprising the heterochromatic arm and growing it
in the presence of an agent that destabilizes the chromosome.

46. The method of claim 45, further comprising identifying cells
that contain a heterochromatic chromosome that is about 50 to about
400 Mb.

47. The method of any of claims 44-46, wherein the DNA
fragment is introduced into or adjacent to an amplifiable region of a
chromosome in the cell.

48. The method of claim 47, wherein the amplifiable region
comprises rDNA.

49. The method of claim 47, wherein the amplifiable region
comprises heterochromatin.

50. The method of claim 47, wherein the DNA is introduced into
pericentric heterochromatin in a chromosome of the cell.

51. A method for producing a transgenic (non-human) animal,
comprising introducing a satellite artificial chromosome [SATAC] into an
embryonic cell.

52. The method of claim 51, wherein the embryonic cell is a
stem cell.

53. The method of claim 51, wherein the embryonic cell is in an
embryo.

54. The method of any of claims 51-53, wherein the SATAC
comprises heterologous DNA that encodes a gene product.

55. The method of any of claims 51-54, wherein the SATAC
comprises heterologous DNA that encodes a therapeutic product.

56. The method of claim 55, wherein the anti-HIV ribozyme is an
anti-gag ribozyme, and the tumor suppressor gene is p53.

57. The method of claim 54, wherein the product comprises an
antigen that upon expression induces an immunoprotective response
against a pathogen in the transgenic (non-human) animal.

58. The method of claim 54, wherein the product comprises a
plurality of antigens that upon expression induce an immunoprotective
response against a plurality of pathogens.

59. The method of any of claims 51-58, wherein the transgenic
(non-human) animal is a fish, insect, reptile, amphibian, arachnid or
mammal.

60. The method of any of claims 51-59, wherein the SATAC is
introduced by cell fusion, lipid-mediated transfection by a carrier system,
microinjection, microcell fusion, electroporation, microprojectile, nuclear
transfer, bombardment or direct DNA transfer.

61. A transgenic (non-human) animal produced by the method of
any of claims 51-60.

62. The transgenic animal that is a fish, insect, reptile, amphibian,
arachnid, or mammal.

63. A method for producing a transgenic plant or animal,
comprising:

introducing a DNA fragment into a cell, wherein the DNA
fragment comprises a selectable marker;

growing the cell under selective conditions to produce cells
that have incorporated the DNA fragment into their genomic DNA; and

selecting a cell that comprises a minichromosome that is
about 10 Mb to about 50 Mb that [illegible]
euchromatin;

isolating the minichromosome and introducing it into a plant
or animal cell.

64. The method of claim 63, wherein: after selecting the cell,
DNA encoding a gene product or products is introduced into the cell, and
the cell is grown under selective conditions, whereby cells comprising
minichromosomes comprising the DNA encoding the gene product(s)
are produced.

65. The method of claim 64, wherein after selecting the
[illegible] cells comprising
SATACs that comprise the DNA encoding the gene product(s) are
produced.

66. A method for cloning a centromere from an animal or plant,
comprising:

preparing a library of DNA fragments that comprise the
genome of the plant or animal;

introducing each of the fragments into mammalian satellite
artificial chromosomes [SATACs], wherein:

each SATAC comprises a centromere from a different
species from the selected plant or animal, and a selectable marker;

introducing each of the SATACs into the cells and growing
the cells under selective conditions;

identifying cells that have a SATAC; and

selecting from among those cells any that have a SATAC
comprising a centromere that differs from the centromeres in the original
SATAC.

67. A cell line having the identifying characteristics of any of
TF1004G19C5, 19C5xHa4, H1D3 and G3D5, which have been deposited
at the ECACC under Accession Nos. 96040926, 96040927, 96040929,
and 96040928, respectively.

68. A cell line, comprising a megachromosome that comprises
about 50-400 Mb.

69. A cell line of claim 68, wherein the megachromosome
comprises 250 to about 400 Mb.

70. A cell line of claim 68, wherein the megachromosome
comprises about 150 to about 200 Mb.

71. A cell line of claim 68, wherein the megachromosome
comprises about 90 to about 120 Mb.

72. A cell line of claim 68, wherein the megachromosome
comprises about 60 to about 100 Mb.

73. A method for gene therapy, comprising:

introducing a SATAC that comprises DNA encoding a therapeutic
product into a target cell; and

introducing the resulting target cells into a host animal.

74. The method of claim 73, wherein the target cells are
lymphocytes, stem cells, nerve cells, insect cells, chicken cells or muscle
cells.

75. The method of claim 73, wherein the minichromosome is the
minichromosome present in the cell line EC3/7C5.

76. The method of claim 73, wherein the chromosome is the
λneo-chromosome in the cell line KE1-2/4.

77. The artificial chromosome of claim 76 that is between about
20 Mb and about 200 Mb.

78. The artificial chromosome of claim 76 that is between about
[illegible].

79. The artificial chromosome of claim 76 that is between about
20 Mb and about 200 Mb.

80. The artificial chromosome of claim 76 that is between about
1 Mb and about 15 Mb.

81. The method of claim 51, wherein the animal is a mammal or
a [illegible] the SATAC includes [illegible] proteins and regulatory
[illegible] expression of genes in the milk of the animal or in the egg of
the animal.

82. The method of claim 81, wherein the animal is selected from
[illegible].

83. [illegible] is selected from
among fowl [illegible].

84. The method of claim 51, wherein the SATAC includes DNA
[illegible] expression of human cell surface proteins, whereby
[illegible] the animal express the human proteins and will not be
rejected upon transplantation into a human.

85. [illegible] comprising the DNA having the sequence set
forth in SEQ ID NO: 13, 14 or 15.

86. Isolated DNA, comprising the DNA having the sequence set
[illegible].

87. An isolated DNA fragment, comprising a sequence of
nucleotides set forth in any of SEQ ID Nos. 18-24.

88. A SATAC of claim 14, comprising a sequence of nucleotides
set forth in any of SEQ ID Nos. 18-24.

89. A SATAC of claim 13, comprising a sequence of nucleotides
set forth in any of SEQ ID Nos. 18-24.

90. A SATAC of claim 13, comprising a sequence of nucleotides
set forth in any of SEQ ID Nos. 18-24.

91. A cellular production system, comprising a cell containing an
artificial chromosome [AC], wherein the AC comprises multiple copies of
a heterologous gene or a plurality of heterologous genes.

92. The cellular production system of claim 91, wherein the AC
is a SATAC.

93. The system of claim 91 or claim 92, wherein the
heterologous genes encode proteins that comprise a metabolic pathway.

94. A method of expression of a product that is produced upon
expression of a metabolic pathway, comprising culturing the system of
claim 93 under conditions whereby the proteins comprising the pathway
are expressed to produce the product.

95. The method of claim 94, wherein the product is a vitamin, a
hormone, a nucleotide, an amino acid, a protein or a peptide.

96. The method of claim 51, wherein the animal is oviparous.

97. The method of claim 51, wherein the animal is a chicken.

98. The method of claim 51, wherein the animal is an insect.

99. A method for producing a transgenic plant, comprising
introducing a satellite artificial chromosome [SATAC] of any of claims
13-18 or 88-90 into a plant cell; and culturing the cell under conditions
whereby a plant is generated.

100. The method of claim 99, wherein the SATAC is introduced
by protoplast fusion, microinjection, microcell fusion, lipid-mediated gene
transfer, electroporation, microprojectile bombardment or direct DNA
transfer.

101. A method for producing a gene product(s), comprising
introducing a satellite artificial chromosome [SATAC] of any of claims 13-
18 or 88-90 into a cell; and culturing the cell under conditions whereby
the gene product(s) is (are) expressed.

102. The method of claim 101, wherein the gene product is
produced by expression of a series of genes that encode proteins that
comprise a metabolic pathway [illegible]
genes.

103. An in vitro synthesized artificial mammalian chromosome
(ISMAC), comprising a centromere, a telomere, a megareplicator, and a
selectable marker, wherein the centromere is derived from a SATAC of
any of claims 12-18 and 88-90.

104. An in vitro synthesized artificial mammalian chromosome
(ISMAC), comprising a centromere, a telomere, a megareplicator, and a
selectable marker, wherein the centromere is derived from a SATAC of
any of claims 12-18, and 88-90.

105. The ISMAC of claim 103 or claim 104, further comprising
heterochromatin.

106. The ISMAC of any of claims 103-105, wherein the
megareplicator comprises rDNA.

107. The ISMAC of any of claims 103-106, wherein the
centromere is a human centromere.

108. The ISMAC of any of claims 103-107, wherein the
centromere is derived from a megachromosome.

109. The ISMAC of any of claims 103, 105, 106 or 108, wherein
the centromere is derived from a cell line having all of the identifying
characteristics of the cell line deposited at the European Collection
of Animal Cell Culture (ECACC) under Accession No. 96040929.

110. The method of claim 54, wherein the product is a hormone,
antibody, cytokine, growth factor, regulatory protein, secretable proteins.

111. The method of claim 54, wherein the product is the cystic
fibrosis transmembrane regulatory protein [CFTR], an anti-HIV ribozyme,
or a tumor suppressor gene.

112. The method of claim 66, wherein the animal [illegible].

113. A method of producing an in vitro synthesized artificial
mammalian chromosome (ISMAC), comprising combining a centromere, a
telomere, a megareplicator, and a selectable marker to produce a
replicable ISMAC, wherein the centromere is derived from a SATAC of any
of claims 12-18, and 88-90.

114. The method of claim 113, further comprising including rDNA
in the ISMAC.

115. The method of claim 113, wherein the telomere comprises a
plurality of repeats of SEQ ID No. 29.

116. The method of claim 115, wherein the telomere is about 1
kB up to about 1 Mb, preferably about 1 kB up to about 500 kB.

117. The ISMAC of any of claims 103-108, wherein the telomere
comprises a plurality of repeats of SEQ ID No. 29.

118. The ISMAC of claim 117, wherein the telomere is about 1 kB
up to about 1 Mb, preferably about 1 kB up to about 500 kB.

119. A method for producing an artificial chromosome,
comprising:

introducing a DNA fragment into a cell, wherein the DNA
fragment comprises a selectable marker;

growing the cell under selective conditions to produce cells
that have incorporated the DNA fragment into their genomic DNA,
wherein the DNA fragment is introduced into or adjacent to an amplifiable
region of a chromosome in the cell, whereby a minichromosome
comprising the DNA derived from the amplifiable region is produced,
wherein the minichromosome is an artificial chromosome that contains
more euchromatin than heterochromatin.

120. The method of claim 119, wherein the amplifiable region is
rDNA.

121. The method of claim 119 or 120, further comprising isolating
the minichromosome.

122. A minichromosome produced by the methods of any of
claims 119-121.

[illegible] encoding a gene product, wherein:

the DNA encoding the gene is on the fragment that comprises the
selectable marker [illegible]
SATAC comprises the heterologous DNA that encodes a gene product.



[FIG. 1 (sheet 1/5): schematic of neo-minichromosome formation. Recoverable labels: chromatids, telomere, kinetochore, centromere; transfection and integration of λCM8 and λgtWESneo DNA into a mouse chromosome (LMTK- cell line) with euchromatin and heterochromatin (satellite DNA); amplification and centromere formation; selection (G-418); dicentric chromosome (EC3/7 cell line) with mitotic spindle, telomeric DNA, kinetochore, satellite DNA and "foreign" DNA; chromosome breakage; formerly dicentric chromosome; chromosome fragment with neo-centromere, 10-15 Mb (EC3/7 cell line); duplication and single cell cloning; neo-minichromosome, 20-30 Mb (EC3/7-C5 and EC3/7-C6 cell lines).]



[FIG. 2 (sheet 2/5): schematic of the derivation of the minichromosome, sausage chromosome and megachromosome. Recoverable labels: transfection and integration of foreign DNA (λCM8 and λgtWESneo) into chromosome #7 (mouse LMTK- cell line); chromosome #7 with traces of "foreign" DNA and chromosome fragment with neo-centromere, 10-15 Mb (EC3/7 cell line); dicentric chromosome #7 with mitotic spindle, kinetochores and chromosome breakage; neo-minichromosome, 20-30 Mb (EC3/7C5 and EC3/7C6 cell lines); "foreign" DNA (HIVrib-HYG, β-gal, λ) in mouse chromosome #7 (EC3/7 transformed cell line); "sausage" chromosome, ~150-200 Mb (TF1004G-19C5 cell line); cell fusion and selection; λneo-chromosome, 150-200 Mb (KE1-2/4 hybrid cell line); > 1,000 Mb; gigachromosome with variable length of heterochromatic arm (19C5xHa4 hybrid cell line); stable megachromosome, ~250-400 Mb (H1D3 hybrid cell line); amplicons a1 to a(N); euchromatic telomere and centromere. Key: euchromatin; constitutive heterochromatin; integrated "foreign" DNA; a = amplicons.]



[FIG. 3 (sheet 3/5): schematic of megareplicon amplification. Recoverable labels: primary replication initiation site (megareplicator); secondary origins of replication; megareplicon of the centromeric region of mouse chromosomes with two ~7.5 Mb tandem blocks of mouse major satellite DNA (mSAT) flanked by non-satellite DNA sequences; integration of "foreign" DNA (pH132, pCH110, λ); replication error generates inverted megareplicons; amplification produces a tandem array of identical chromosome segments (amplicons) that contain two inverted megareplicons bordered by the heterologous ("foreign") DNA; chromatids, telomere, centromere; ~7.5 Mb; >15 Mb; ~30 Mb amplicon; stable megachromosome (~250-400 Mb).]



[Sheet 4/5: flow chart of cell line derivation. Recoverable box and arrow labels, in reading order: EC3/7, mouse LMTK- fibroblast cell line with neo-centromere; single-cell subcloning; fusion with CHO K20 cells and selection with G418 and HAT; EC3/7C5, mouse LMTK- fibroblast cell line with the neo-minichromosome and the formerly dicentric chromosome; KE1-2/4, mouse-hamster hybrid cell line with the stable λneo-chromosome; cotransfection with plasmids pH132 (anti-HIV ribozyme and hygromycin-resistance genes), pCH110 (lacZ gene) and λ cI 875 Sam7 (λ phage), selection with hygromycin B; TF1004G-19C5, mouse LMTK- fibroblast cell line with neo-minichromosome and stable sausage chromosome; fusion with Chinese hamster ovary cells (CHO K20 cell line), selection with HAT and hygromycin B; 19C5xHa4, mouse-hamster hybrid cell lines carrying the neo-minichromosome and the sausage chromosome and containing a complete hamster genome and partial mouse genome; BrdU treatment, single-cell cloning, selection with hygromycin B; H1D3, mouse-hamster hybrid cell line carrying a megachromosome but no minichromosome; fusion with CD4+ HeLa cells containing neo-r, selection with G418 and hygromycin B; H1xHe41, mouse-hamster-human hybrid cell line carrying the megachromosome and a single human chromosome with CD4 and neo-r genes, containing complete hamster and partial mouse genomes; BrdU treatment, single-cell cloning, selection with G418, BrdU treatment and recloning; G3D5 and G3D6, mouse-hamster hybrid cell lines carrying the megachromosome and neo-minichromosome, or the neo-minichromosome only; reclone and grow in G418 and hygromycin B (carries megachromosome and neo-minichromosome); reclone and grow in G418 (carries neo-minichromosome only).]



[Sheet 5/5: rectified sheet (Rule 91); no legible text recovered.]



REVISED 
VERSION* 



WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(51) International Patent Classification 6:
C12N 15/90, 15/85

A3

(11) International Publication Number: WO 97/40183

(43) International Publication Date: 30 October 1997 (30.10.97)

(21) International Application Number: PCT/US97/05911

(22) International Filing Date: 10 April 1997 (10.04.97)

(30) Priority Data:
629,822    10 April 1996 (10.04.96)    US
682,080    15 July 1996 (15.07.96)     US
695,191    7 August 1996 (07.08.96)    US



(71) Applicants (for all designated States except US): THE
BIOLOGICAL RESEARCH CENTER OF THE HUNGARIAN
ACADEMY OF SCIENCES [HU/HU]; P.O. Box
521, H-6701 Szeged (HU). LOMA LINDA UNIVERSITY
[US/US]; Loma Linda, CA 92350 (US). AMERICAN GENE
THERAPY, INC. [CA/CA]; 5022 154th Street, Edmonton,
Alberta T6H 5PE (CA).

(72) Inventors; and

(75) Inventors/Applicants (for US only): HADLACZKY, Gyula
[HU/HU]; Szamos U.I.A. DC. 36., H-6723 Szeged (HU).
SZALAY, Aladar, A. [US/US]; 7327 Fairwood, Highland,
CA 92346 (US).

(74) Agent: SEIDMAN, Stephanie, L.; Brown, Martin, Haller &
McClain, 1660 Union Street, San Diego, CA 92101-2926
(US).

(81) Designated States: AL, AM, AT, AU, AZ, BA, BB, BG, BR,
BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB, GE,
GH, HU, IL, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR,
LS, LT, LU, LV, MD, MG, MK, MN, MW, MX, NO, NZ,
PL, PT, RO, RU, SD, SE, SG, SI, SK, TJ, TM, TR, TT,
UA, UG, US, UZ, VN, YU, ARIPO patent (GH, KE, LS,
MW, SD, SZ, UG), Eurasian patent (AM, AZ, BY, KG, KZ,
MD, RU, TJ, TM), European patent (AT, BE, CH, DE, DK,
ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE), OAPI
patent (BF, BJ, CF, CG, CI, CM, GA, GN, ML, MR, NE,
SN, TD, TG).



Published 

With a revised version of the international search report.
Before the expiration of the time limit for amending the claims
and to be republished in the event of the receipt of amendments.

(88) Date of publication of the international search report:
5 February 1998 (05.02.98)

(88) Date of publication of the revised version of the international
search report: 30 April 1998 (30.04.98)

(54) Title: ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES

(57) Abstract

Methods for preparing cell lines that contain artificial chromosomes, methods for preparation of artificial chromosomes, methods for
purification of artificial chromosomes, methods for targeted insertion of heterologous DNA into artificial chromosomes, and methods for
delivery of the chromosomes to selected cells and tissues are provided. Also provided are cell lines for use in the methods, and cell lines
and chromosomes produced by the methods. In particular, satellite artificial chromosomes that, except for inserted heterologous DNA,
are substantially composed of heterochromatin, are provided. Methods for use of the artificial chromosomes, including for gene therapy,
production of gene products and production of transgenic plants and animals are also provided.

*(Referred to in PCT Gazette No. 17/1998, Section II)



FOR THE PURPOSES OF INFORMATION ONLY 



Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT. 



AL Albania
AM Armenia
AT Austria
AU Australia
AZ Azerbaijan
BA Bosnia and Herzegovina
BB Barbados
BE Belgium
BF Burkina Faso
BG Bulgaria
BJ Benin
BR Brazil
BY Belarus
CA Canada
CF Central African Republic
CG Congo
CH Switzerland
CI Côte d'Ivoire
CM Cameroon
CN China
CU Cuba
CZ Czech Republic
DE Germany
DK Denmark
EE Estonia



ES Spain
FI Finland
FR France
GA Gabon
GB United Kingdom
GE Georgia
GH Ghana
GN Guinea
GR Greece
HU Hungary
IE Ireland
IL Israel
IS Iceland
IT Italy
JP Japan
KE Kenya
KG Kyrgyzstan
KP Democratic People's Republic of Korea
KR Republic of Korea
KZ Kazakstan
LC Saint Lucia
LI Liechtenstein
LK Sri Lanka
LR Liberia



LS Lesotho
LT Lithuania
LU Luxembourg
LV Latvia
MC Monaco
MD Republic of Moldova
MG Madagascar
MK The former Yugoslav Republic of Macedonia
ML Mali
MN Mongolia
MR Mauritania
MW Malawi
MX Mexico
NE Niger
NL Netherlands
NO Norway
NZ New Zealand
PL Poland
PT Portugal
RO Romania
RU Russian Federation
SD Sudan
SE Sweden
SG Singapore
SI Slovenia
SK Slovakia
SN Senegal
SZ Swaziland
TD Chad
TG Togo
TJ Tajikistan
TM Turkmenistan
TR Turkey
TT Trinidad and Tobago
UA Ukraine
UG Uganda
US United States of America
UZ Uzbekistan
VN Viet Nam
YU Yugoslavia
ZW Zimbabwe



REVISED VERSION

INTERNATIONAL SEARCH REPORT

International Application No



A. CLASSIFICATION OF SUBJECT MATTER 

IPC 6 C12N15/90 C12N15/85

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)

IPC 6 C12N

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practical, search terms used)



C. DOCUMENTS CONSIDERED TO BE RELEVANT 



Category * Citation of document, with indication, where appropriate, of the relevant passages

Relevant to claim No.



G. HADLACZKY ET AL.: "Centromere
formation in mouse cells cotransformed
with human DNA and a dominant marker gene"
PROCEEDINGS OF THE NATIONAL ACADEMY OF
SCIENCES,
vol. 88, 1991, WASHINGTON, DC, US,
pages 8106-8110, XP002037028
cited in the application
see the whole document

-/--

1-50, 75,
76



Further documents are listed in the continuation of box C.

Patent family members are listed in annex.



* Special categories of cited documents:

"A" document defining the general state of the art which is not
considered to be of particular relevance

"E" earlier document but published on or after the international
filing date

"L" document which may throw doubts on priority claim(s) or
which is cited to establish the publication date of another
citation or other special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or
other means

"P" document published prior to the international filing date but
later than the priority date claimed

"T" later document published after the international filing date
or priority date and not in conflict with the application but
cited to understand the principle or theory underlying the
invention

"X" document of particular relevance; the claimed invention
cannot be considered novel or cannot be considered to
involve an inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention
cannot be considered to involve an inventive step when the
document is combined with one or more other such documents,
such combination being obvious to a person skilled
in the art

"&" document member of the same patent family



Date of the actual completion of the international search

24 February 1998

Date of mailing of the international search report

10.03.98



Name and mailing address of the ISA 

European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax: (+31-70) 340-3016



Authorized officer 



Mateo Rosell, A.M.



Form PCT/ISA/210 (second sheet) (July 1992)







INTERNATIONAL SEARCH REPORT

International Application No
PCT/US 97/05911

C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Category * Citation of document, with indication, where appropriate, of the relevant passages

Relevant to claim No.


X 


EP 0 532 850 A (BIOLOGICAL RESEARCH CENTER
OF THE HUNGARIAN ACADEMY OF SCIENCES) 17
March 1993
see the whole document, specially example 6

33-50,
75, 76

1-32


A 


& US 5 288 625 A
cited in the application

63-65


Y 


K. FATYOL ET AL.: "Cloning and molecular
characterization of a novel chromosome
specific centromere sequence of Chinese
hamster"
NUCLEIC ACIDS RESEARCH,
vol. 22, no. 18, 1994, OXFORD, GB,
pages 3728-3736, XP002037027
cited in the application
see the whole document


1-32 


■ V ■ 
A 


T. PRAZNOVSZKY ET AL.: "De novo
chromosome formation in rodent cells"
PROCEEDINGS OF THE NATIONAL ACADEMY OF
SCIENCES,
vol. 88, 1991, WASHINGTON, DC, US,
pages 11042-11046, XP002037029
see the whole document


33-50, 
75,76 


A 


F. RAYNAL ET AL.: "Complete nucleotide
sequence of mouse 18S rRNA gene:
comparison with other available homologs"
FEBS LETTERS,
vol. 167, 1984, AMSTERDAM, NL,
pages 263-268, XP002037024
see the whole document, specially figure 2


87-90,
99-101,
103-108,
113


Y 
A 


R.M. TORCZYNSKI ET AL.: "Cloning and
sequencing of a human 18S ribosomal RNA
gene"
DNA,
vol. 4, no. 4, 1985, NEW YORK, NY, US,
pages 283-291, XP002037025
see the whole document, specially figure 3


87-90,
99-108,
113


X 


S. CROSS ET AL.: "The structure of a
subterminal repeated sequence present on
many human chromosomes"
NUCLEIC ACIDS RESEARCH,
vol. 18, 1990, OXFORD, GB,
pages 6649-6657, XP002037026
see the whole document, specially figure 3

-/--

115, 117






INTERNATIONAL SEARCH REPORT

International Application No
PCT/US 97/05911

C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Category * Citation of document, with indication, where appropriate, of the relevant passages

Relevant to claim No.




A 


WO 94 23049 A (THE JOHNS HOPKINS
UNIVERSITY) 13 October 1994
see the whole document

51-61,
81-84,
96-98,
110, 111


A 


EP 0 473 253 A (BIOLOGICAL RESEARCH CENTER
OF THE HUNGARIAN ACADEMY OF SCIENCES) 4
March 1992
cited in the application
see abstract
see examples 1,3

66-72,
112


A 


WO 95 32297 A (CANCER RESEARCH CAMPAIGN
TECHNOLOGY LIMITED) 30 November 1995
see abstract
see page 5-5
see example 6

51-65,
73-84,
99, 100


A 


A. SCHEDL ET AL.: "A method for
generation of YAC transgenic mice by
pronuclear microinjection"
NUCLEIC ACIDS RESEARCH,
vol. 21, no. 20, 1993, OXFORD, GB,
pages 4783-4787, XP000616418
cited in the application
see the whole document


51-62 


A 


WO 94 24300 A (TRANSGENE SA) 27 October 
1994 

see abstract 


3, 37,
119-122


A

WO 92 17582 A (THE REGENTS OF THE
UNIVERSITY OF MICHIGAN) 15 October 1992
see page 11-19
see abstract
& US 5 240 840 A
cited in the application

13-32,
67-72


A

WO 95 29992 A (THE REGENTS OF THE
UNIVERSITY OF MICHIGAN) 9 November 1995
see page 12-14

13-32,
67-72


P,X

J. KERESO ET AL.: "De novo chromosome
formations by large-scale amplification of
the centromeric region of mouse
chromosomes"
CHROMOSOME RESEARCH,
vol. 4, no. 3, 5 June 1996, OXFORD, GB,
pages 226-239, XP002037022
see the whole document

1-21,
23-50,
75, 76,
78,
119-123






INTERNATIONAL SEARCH REPORT

International Application No
PCT/US 97/05911

C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Category * Citation of document, with indication, where appropriate, of the relevant passages

Relevant to claim No.

P,X

GY. HOLLO ET AL.: "Evidence for a
megareplicon covering megabases of
centromeric chromosome segments"
CHROMOSOME RESEARCH,
vol. 4, no. 3, 5 June 1996, OXFORD, GB,
pages 240-247, XP002037023
see the whole document

1-21,
23-32,
35-40






INTERNATIONAL SEARCH REPORT

Information on patent family members

International Application No
PCT/US 97/05911

Patent document    Publication date    Patent family member(s)    Publication date

EP 0532050 A       17-03-93            US 5288625 A               22-02-94
                                       CA 2078189 A               14-03-93
                                       JP 7177881 A               18-07-95

WO 9423049 A       13-10-94            NONE

EP 0473253 A       04-03-92            CA 2042093 A               10-11-91
                                       JP 6121685 A               06-05-94
                                       US 5712134 A               27-01-98

WO 9532297 A       30-11-95            AU 2534395 A               18-12-95
                                       ZA 9504300 A               24-01-95

WO 9424300 A       27-10-94            FR 2703996 A               21-10-94
                                       AU 6571994 A               08-11-94
                                       CA 2160697 A               27-10-94
                                       EP 0694072 A               31-01-96
                                       JP 8508878 T               24-09-96

WO 9217582 A       15-10-92            US 5240840 A               31-08-93

WO 9529992 A       09-11-95            US 5635376 A               03-06-97



Form PCT/ISA/210 (patent family annex) (July 1992)



